Abstract - This paper studies the wireless two-way relay channel (TWRC), where two source nodes, S1 and S2, exchange information through an assisting relay node, R. It is assumed that R receives the sum signal from S1 and S2 in one time-slot, and then amplifies and forwards the received signal to both S1 and S2 in the next time-slot. By applying the principle of analogue network coding (ANC), each of S1 and S2 cancels the so-called "self-interference" in the received signal from R and then decodes the desired message. Assuming that S1 and S2 are each equipped with a single antenna and R with multiple antennas, this paper analyzes the capacity region of the ANC-based TWRC with linear processing (beamforming) at R. The capacity region contains all the achievable bidirectional rate-pairs of S1 and S2 under the given transmit power constraints at S1, S2, and R. We present the optimal relay beamforming structure as well as an efficient algorithm to compute the optimal beamforming matrix based on convex optimization techniques. Low-complexity suboptimal relay beamforming schemes are also presented, and their achievable rates are compared against the capacity with the optimal scheme.

Index Terms — Analogue network coding, beamforming, convex 
optimization, two-way relay channel. 



I. Introduction 

Network coding [1] is a new and promising design paradigm for modern communication networks: By allowing intermediate network nodes to mix the data or signals received from multiple links, as opposed to separating them by traditional approaches, network coding reduces the number of transmissions in the network and thus improves the overall network throughput. Recently, the research community has paid increasing attention to applying the principle of network coding in wireless communication networks. In fact, the wireless network is the most natural setting in which to apply network coding, owing to the broadcast property of radio transmissions, i.e., a single transmission of one wireless terminal may successfully reach multiple neighboring terminals, without the need for dedicated links to these terminals as required in wireline networks. Furthermore, network coding can potentially be a very effective solution to the classical "interference problem" in wireless networks, since it transforms the traditional approach of avoiding or mitigating the interference among wireless terminals into a new methodology of interference exploitation.
The two-way relay channel (TWRC) is one of the basic elements in decentralized/centralized wireless networks. The simplest TWRC consists of two source nodes, S1 and S2, which exchange information via a helping relay node, R. Traditionally, in order to avoid interference at R, S1 and S2 do not transmit simultaneously at the same frequency. Thus, in total four time-slots are usually required to accomplish one round of information exchange between S1 and S2 via R. However, by applying the idea of network coding, the authors in [2] proposed a method to reduce the number of required time-slots from four to three. In this method, S1 first sends to R during time-slot 1 the message $s_1$ consisting of bits $b_1(1), \ldots, b_1(N)$, with $N$ denoting the message length in bits, and then R decodes $s_1$. During time-slot 2, S2 sends to R the message $s_2$ consisting of bits $b_2(1), \ldots, b_2(N)$, and R decodes $s_2$. In time-slot 3, R broadcasts to S1 and S2 a new message $s_3$ consisting of bits $b_3(1), \ldots, b_3(N)$ obtained by bit-wise exclusive-or (XOR) operations over $b_1(n)$'s and $b_2(n)$'s, i.e., $b_3(n) = b_1(n) \oplus b_2(n), \forall n$. Since S1 knows $b_1(n)$'s, S1 can recover its desired message $s_2$ by first decoding $s_3$ and then obtaining $b_2(n)$'s as $b_1(n) \oplus b_3(n), \forall n$. Similarly, S2 can recover $s_1$.
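The three-time-slot XOR scheme of [2] described above can be illustrated with a minimal sketch. The bit patterns and the helper name `xor_bits` are hypothetical and serve only as an example; channel coding and decoding errors are ignored.

```python
def xor_bits(a, b):
    """Bit-wise XOR of two equal-length bit lists."""
    if len(a) != len(b):
        raise ValueError("messages must have the same length N")
    return [x ^ y for x, y in zip(a, b)]

# Hypothetical N = 8 bit messages (illustration only).
b1 = [1, 0, 1, 1, 0, 0, 1, 0]   # bits of s1, sent by S1 in time-slot 1
b2 = [0, 1, 1, 0, 1, 0, 1, 1]   # bits of s2, sent by S2 in time-slot 2

# Time-slot 3: R broadcasts b3(n) = b1(n) XOR b2(n).
b3 = xor_bits(b1, b2)

# Each source XORs the broadcast with its own bits to recover the other's bits.
b2_at_s1 = xor_bits(b1, b3)
b1_at_s2 = xor_bits(b2, b3)
```

Because XOR is its own inverse, `b2_at_s1` equals `b2` and `b1_at_s2` equals `b1`, which is why a single broadcast serves both directions.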

The principle of network coding has been further investigated for TWRC by exploiting various physical-layer relay operations [3], [4]. The scheme proposed in [3] is named analogue network coding (ANC), while the one in [4] is named physical-layer network coding (PNC). For both ANC and PNC, the number of time-slots required for S1 and S2 to exchange one round of information is reduced from three [2] to two, by allowing S1 and S2 to transmit simultaneously to R during one time-slot and thereby combining the first two time-slots in [2] into one time-slot. ANC and PNC differ in their corresponding relay operations, which are amplify-and-forward (AF) and estimate-and-forward (EF), respectively. In ANC, R linearly amplifies the sum signal received from S1 and S2, and then broadcasts the resulting signal to S1 and S2. ANC is based upon an interesting observation that the signal collision at R during the first time-slot is in fact harmless, since such a collision can be resolved at S1 (S2) during the second time-slot by subtracting from its received signal the so-called self-interference, which is related to the previously transmitted message from S1 (S2) itself. In contrast to ANC, more sophisticated (nonlinear) operations than AF are required at R for PNC [4]-[7]. Instead of decoding messages $s_1$ from S1 and $s_2$ from S2 separately in two different time-slots as in [2], the EF method proposed in [4] estimates at R the bitwise XORs between $b_1(n)$'s and $b_2(n)$'s from the mixed signal of S1 and S2, and re-encodes the decoded bits into a new broadcasting message $s_3$; each one of S1 and S2 then recovers the other's message by the same decoding method as that in [2]. Alternatively, it is possible to first deploy multiuser decoding at R to decode $s_1$ and $s_2$ separately, and then jointly encode $s_1$ and $s_2$ into a new broadcasting message $s_3$; given the side information on $s_1$ ($s_2$) at S1 (S2), S1 (S2) decodes $s_2$ ($s_1$). The above decode-and-forward (DF) relay operation for TWRC has been studied in [8], [9]. On the other hand, TWRC has also been studied in [10]-[13] from cooperative communication perspectives, with a major objective to compensate for the loss of spectral efficiency in the conventional one-way relay channel (OWRC) owing to the half-duplex constraint. Not surprisingly, the solutions proposed therein are similar to those inspired by the principle of network coding.

Furthermore, TWRC has been studied jointly with other physical-layer transmission techniques based on, e.g., orthogonal-frequency-division-multiplexing (OFDM) [14], [15], and multiple transmit and/or multiple receive antennas [16]-[19], to further improve the bidirectional relay throughput. For the multi-antenna TWRC, the DF relay strategy was studied in [16], [17], the AF relay strategy or ANC was studied in [18], and the distributed space-time coding strategy for the relay was studied in [19]. In this paper, we focus on the AF-/ANC-based multi-antenna TWRC. Assuming that S1 and S2 each has a single antenna and R has $M$ antennas, $M \geq 2$, we study the optimal design of linear processing (beamforming) at the relay to achieve the capacity region of AF-/ANC-based TWRC, which consists of all the achievable rate-pairs of S1 and S2 under the given transmit power constraints at S1, S2, and R. Our main goal is to provide insightful guidelines on the design of AF-based multi-antenna TWRC, which differs from the results for the conventional AF-based multi-antenna OWRC given in, e.g., [20]-[22]. The main results of this paper are summarized as follows.^1

• We derive the optimal beamforming structure at R, which achieves the capacity region of an ANC-based TWRC. The optimal structure reduces the number of complex-valued design variables in the relay beamforming matrix from $M^2$ to 4 when $M > 2$. Furthermore, by transforming the capacity region characterization problem into an equivalent relay power minimization problem under certain signal-to-noise-ratio (SNR) constraints at S1 and S2, we derive an efficient algorithm to compute the globally optimal beamforming matrix based on convex optimization techniques.

• Inspired by the optimal relay beamforming structure, we propose two low-complexity suboptimal beamforming schemes, based on the principle of "matched-filter (MF)" and "zero-forcing (ZF)", respectively. We analyze their performances in terms of the achievable sum-rate in TWRC against the maximum sum-rate, or the sum-capacity, achieved by the optimal scheme. It is shown that the ZF-based relay beamforming with the objective of suppressing the uplink (from S1 and S2 to R) and downlink (from R to S1 and S2) interferences at R may not be a good solution for the ANC-based TWRC, since these interferences are indeed self-interferences and thus can be later removed at S1 and S2. On the other hand, it is shown that the MF-based relay beamforming, which maximizes the signal power forwarded to S1 and S2, achieves a sum-rate close to the sum-capacity under various SNR and channel conditions.

^1 Preliminary versions of this paper have been presented in [23], [24].

The rest of this paper is organized as follows. Section II describes the TWRC model with ANC. Section III studies the capacity region of the ANC-based TWRC, derives the optimal structure for relay beamforming, and proposes an algorithm to compute the optimal beamforming matrix. Section IV presents the low-complexity suboptimal relay beamforming schemes. Section V analyzes the performances of both the optimal and suboptimal relay beamforming schemes in terms of the achievable sum-rate in TWRC. Section VI shows numerical results on the performances of the proposed schemes, in comparison with other existing schemes in the literature. Finally, Section VII concludes the paper.

Notation: Scalars are denoted by lower-case letters, e.g., $x$, bold-face lower-case letters are used for vectors, e.g., $\mathbf{x}$, and bold-face upper-case letters for matrices, e.g., $\mathbf{X}$. In addition, $\mathrm{tr}(\mathbf{S})$, $|\mathbf{S}|$, $\mathbf{S}^{-1}$, and $\mathbf{S}^{1/2}$ denote the trace, determinant, inverse, and square-root of a square matrix $\mathbf{S}$, respectively, and $\mathrm{diag}(\mathbf{S}_1, \ldots, \mathbf{S}_M)$ denotes a block-diagonal square matrix with $\mathbf{S}_1, \ldots, \mathbf{S}_M$ as the diagonal square matrices. $\mathbf{S} \succeq 0$ means that $\mathbf{S}$ is a positive semi-definite matrix [25]. For an arbitrary-size matrix $\mathbf{M}$, $\mathbf{M}^T$, $\mathbf{M}^*$, $\mathbf{M}^H$, and $\mathbf{M}^{\dagger}$ denote the transpose, conjugate, conjugate transpose, and pseudo inverse of $\mathbf{M}$, respectively, $\mathbf{M}(i,j)$ denotes the $(i,j)$-th element of $\mathbf{M}$, and $\mathrm{rank}(\mathbf{M})$ denotes the rank of $\mathbf{M}$. $\mathbf{I}$ and $\mathbf{0}$ denote the identity matrix and the all-zero matrix, respectively. $\|\mathbf{x}\|$ denotes the Euclidean norm of a complex vector $\mathbf{x}$, while $|z|$ denotes the norm of a complex number $z$. $\mathbb{C}^{x \times y}$ denotes the space of $x \times y$ matrices with complex-valued elements. The distribution of a circular symmetric complex Gaussian (CSCG) random vector with mean $\mathbf{x}$ and covariance matrix $\mathbf{S}$ is denoted by $\mathcal{CN}(\mathbf{x}, \mathbf{S})$, and $\sim$ stands for "distributed as".

II. System Model 

As shown in Fig. 1, we consider a TWRC consisting of two source nodes, S1 and S2, each with a single antenna, and a relay node, R, equipped with $M$ antennas, $M \geq 2$. All the channels involved are assumed to be flat-fading over a common narrow-band. It is assumed that the transmission protocol of TWRC uses two consecutive equal-duration time-slots for one round of information exchange between S1 and S2 via R. During the first time-slot, both S1 and S2 transmit concurrently to R, which linearly processes the received signal and then broadcasts the resulting signal to S1 and S2 during the second time-slot. It is also assumed that perfect synchronization has been established among S1, S2, and R prior to data transmission. The received baseband signal at R in the first time-slot is expressed as

$$\mathbf{y}_R(n) = \mathbf{h}_1 \sqrt{p_1}\, s_1(n) + \mathbf{h}_2 \sqrt{p_2}\, s_2(n) + \mathbf{z}_R(n) \tag{1}$$

Fig. 1. The two-way multi-antenna relay channel.

where $\mathbf{y}_R(n) \in \mathbb{C}^{M \times 1}$ is the received signal vector at symbol index $n$, $n = 1, \ldots, N$, with $N$ denoting the total number of transmitted symbols during one time-slot; $\mathbf{h}_1 \in \mathbb{C}^{M \times 1}$ and $\mathbf{h}_2 \in \mathbb{C}^{M \times 1}$ represent the channel vectors from S1 to R and from S2 to R, respectively, which are assumed to be constant during the two time-slots; and $s_1(n)$ and $s_2(n)$ are the transmitted symbols from S1 and S2, respectively. Since in this paper we are interested in the information-theoretic limits of TWRC, it is assumed that the optimal Gaussian codebook is used at S1 and S2, and thus $s_1(n)$ and $s_2(n)$ are independent random variables, both $\sim \mathcal{CN}(0, 1)$; $p_1$ and $p_2$ denote the transmit powers of S1 and S2, respectively; and $\mathbf{z}_R(n) \in \mathbb{C}^{M \times 1}$ is the receiver noise vector, independent over $n$, and without loss of generality (w.l.o.g.), it is assumed that $\mathbf{z}_R(n) \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}), \forall n$. Upon receiving the mixed signal from S1 and S2, R processes it with AF relay operation, also known as linear analogue relaying, and then broadcasts the processed signal to S1 and S2 during the second time-slot. Mathematically, the linear processing (beamforming) operation at the relay can be concisely represented as

$$\mathbf{x}_R(n) = \mathbf{A}\, \mathbf{y}_R(n), \quad n = 1, \ldots, N \tag{2}$$

where $\mathbf{x}_R(n) \in \mathbb{C}^{M \times 1}$ is the transmitted signal at R, and $\mathbf{A} \in \mathbb{C}^{M \times M}$ is the relay processing matrix.

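Note that the transmit power of R can be shown equal to

$$p_R(\mathbf{A}) = \mathrm{tr}\left(\mathbb{E}\left[\mathbf{x}_R(n)\mathbf{x}_R^H(n)\right]\right) = \|\mathbf{A}\mathbf{h}_1\|^2 p_1 + \|\mathbf{A}\mathbf{h}_2\|^2 p_2 + \mathrm{tr}(\mathbf{A}\mathbf{A}^H). \tag{3}$$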

We can assume w.l.o.g. that channel reciprocity holds for TWRC during uplink and downlink transmissions, i.e., the channels from R to S1 and S2 during the second time-slot are given as $\mathbf{h}_1^T$ and $\mathbf{h}_2^T$, respectively.^2 Thus, the received signal at S1 can be written as

$$y_1(n) = \mathbf{h}_1^T \mathbf{x}_R(n) + z_1(n) = \mathbf{h}_1^T \mathbf{A} \mathbf{h}_1 \sqrt{p_1}\, s_1(n) + \mathbf{h}_1^T \mathbf{A} \mathbf{h}_2 \sqrt{p_2}\, s_2(n) + \mathbf{h}_1^T \mathbf{A} \mathbf{z}_R(n) + z_1(n) \tag{4}$$

for $n = 1, \ldots, N$, where $z_1(n)$'s are the independent receiver noise samples at S1, and it is assumed that $z_1(n) \sim \mathcal{CN}(0, 1), \forall n$. Note that on the right-hand side (RHS) of (4), the first term is the self-interference of S1, while the second term contains the desired message from S2. Assuming that both $\mathbf{h}_1^T \mathbf{A} \mathbf{h}_1$ and $\mathbf{h}_1^T \mathbf{A} \mathbf{h}_2$ are perfectly known at S1 via training-based channel estimation [26] prior to data transmission, S1 can first subtract its self-interference from $y_1(n)$ and then coherently demodulate $s_2(n)$. The above practice is known as analogue network coding (ANC) [3]. From (4), subtracting the self-interference from $y_1(n)$ yields

$$\tilde{y}_1(n) = \tilde{h}_{21} \sqrt{p_2}\, s_2(n) + \tilde{z}_1(n), \quad n = 1, \ldots, N \tag{5}$$

where $\tilde{h}_{21} = \mathbf{h}_1^T \mathbf{A} \mathbf{h}_2$, and $\tilde{z}_1(n) \sim \mathcal{CN}(0, \|\mathbf{A}^H \mathbf{h}_1^*\|^2 + 1)$.

^2 This assumption is made merely for the purpose of exposition, and the results developed in this paper hold similarly for the more general case with independent uplink and downlink channels.
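From (5), for a given $\mathbf{A}$, the maximum achievable rate (in bits/complex dimension) for the end-to-end link from S2 to S1 via R, denoted by $r_{21}$, satisfies

$$r_{21} \leq \frac{1}{2} \log_2 \left( 1 + \frac{|\mathbf{h}_1^T \mathbf{A} \mathbf{h}_2|^2 p_2}{\|\mathbf{A}^H \mathbf{h}_1^*\|^2 + 1} \right) \tag{6}$$

where the factor $\frac{1}{2}$ is due to the use of two orthogonal time-slots for relaying. Similarly, it can be shown that the maximum achievable rate $r_{12}$ for the link from S1 to S2 via R satisfies

$$r_{12} \leq \frac{1}{2} \log_2 \left( 1 + \frac{|\mathbf{h}_2^T \mathbf{A} \mathbf{h}_1|^2 p_1}{\|\mathbf{A}^H \mathbf{h}_2^*\|^2 + 1} \right). \tag{7}$$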

Next, we define the capacity region of ANC-based TWRC, $\mathcal{C}(P_1, P_2, P_R)$, subject to transmit power constraints at S1, S2, and R, denoted by $P_1$, $P_2$, and $P_R$, respectively. First, for a fixed pair of $p_1$ and $p_2$, $p_1 \leq P_1$ and $p_2 \leq P_2$, we define the achievable rate region for S1 and S2 as

$$\mathcal{R}(p_1, p_2, P_R) \triangleq \bigcup_{\mathbf{A}:\, p_R(\mathbf{A}) \leq P_R} \left\{ (r_{21}, r_{12}) : (6), (7) \right\}. \tag{8}$$

Then, $\mathcal{C}(P_1, P_2, P_R)$ is defined as

$$\mathcal{C}(P_1, P_2, P_R) \triangleq \bigcup_{(p_1, p_2):\, p_1 \leq P_1,\, p_2 \leq P_2} \mathcal{R}(p_1, p_2, P_R). \tag{9}$$

Note that in (9), $\mathcal{C}(P_1, P_2, P_R)$ can be obtained by taking the union over all the achievable rate regions, $\mathcal{R}(p_1, p_2, P_R)$'s, corresponding to different feasible pairs of $p_1$ and $p_2$. Thus, for the rest of this paper, we focus our study on characterization of $\mathcal{R}(p_1, p_2, P_R)$ for some fixed $p_1$ and $p_2$. Also note from (8) that the relay beamforming matrix $\mathbf{A}$ plays the role of realizing different rate tradeoffs between $r_{21}$ and $r_{12}$ on the boundary of $\mathcal{R}(p_1, p_2, P_R)$.

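For the convenience of later analysis in this paper, we express (4) in an equivalent matrix form as follows, combined with $y_2(n)$'s and $z_2(n)$'s defined for S2 similarly as for S1.

$$\begin{bmatrix} y_2(n) \\ y_1(n) \end{bmatrix} = \mathbf{H}_{\rm DL} \mathbf{A} \mathbf{H}_{\rm UL} \begin{bmatrix} \sqrt{p_1}\, s_1(n) \\ \sqrt{p_2}\, s_2(n) \end{bmatrix} + \mathbf{H}_{\rm DL} \mathbf{A} \mathbf{z}_R(n) + \begin{bmatrix} z_2(n) \\ z_1(n) \end{bmatrix} \tag{10}$$

where $\mathbf{H}_{\rm UL} = [\mathbf{h}_1, \mathbf{h}_2] \in \mathbb{C}^{M \times 2}$ and $\mathbf{H}_{\rm DL} = [\mathbf{h}_2, \mathbf{h}_1]^T \in \mathbb{C}^{2 \times M}$ denote the uplink (UL) and downlink (DL) channel matrices, respectively. Note that $\mathbf{H}_{\rm DL} = \mathbf{F} \mathbf{H}_{\rm UL}^T$, where $\mathbf{F} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$.

III. Capacity Region Characterization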

In this section, we study the capacity region of ANC-based TWRC by characterizing $\mathcal{R}(p_1, p_2, P_R)$ defined in (8) for a given set of $p_1$, $p_2$, and $P_R$. First, we derive the optimal relay beamforming structure for $\mathbf{A}$ that attains the boundary rate-pairs of $\mathcal{R}(p_1, p_2, P_R)$. It is shown that with the optimal beamforming structure, the number of unknown complex-valued variables to be sought in $\mathbf{A}$ is reduced from $M^2$ to 4 when $M > 2$. Then, we formulate the optimization problem and present an efficient algorithm to compute the optimal $\mathbf{A}$'s to achieve different boundary rate-pairs of $\mathcal{R}(p_1, p_2, P_R)$.

A. Optimal Relay Beamforming Structure 

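Let the singular-value-decomposition (SVD) of $\mathbf{H}_{\rm UL}$ be expressed as

$$\mathbf{H}_{\rm UL} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^H \tag{11}$$

where $\mathbf{U} \in \mathbb{C}^{M \times 2}$, $\boldsymbol{\Sigma} = \mathrm{diag}(\sigma_1, \sigma_2)$ with $\sigma_1 \geq \sigma_2 \geq 0$, and $\mathbf{V} \in \mathbb{C}^{2 \times 2}$. Since $\mathbf{H}_{\rm DL} = \mathbf{F} \mathbf{H}_{\rm UL}^T$, it thus follows that $\mathbf{H}_{\rm DL} = \mathbf{F} \mathbf{V}^* \boldsymbol{\Sigma} \mathbf{U}^T$. We then have the following theorem.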

Theorem 3.1: The optimal relay beamforming matrix, $\mathbf{A}$, that attains a boundary rate-pair of $\mathcal{R}(p_1, p_2, P_R)$ defined in (8) has the following structure:

$$\mathbf{A} = \mathbf{U}^* \mathbf{B} \mathbf{U}^H \tag{12}$$

where $\mathbf{B} \in \mathbb{C}^{2 \times 2}$ is an unknown matrix.

Proof: Please refer to Appendix I. ■

Remark 3.1: In the conventional AF-based multi-antenna OWRC, the optimal beamforming structure at the relay to maximize the end-to-end channel capacity has been studied in, e.g., [20], [21]. Applying the results therein to the OWRC with S2 transmitting to S1 via R yields the optimal $\mathbf{A}$ to maximize $r_{21}$ in (6) as $\mathbf{A}_{21} = c_{21} \mathbf{h}_1^* \mathbf{h}_2^H$, where $c_{21}$ is a constant related to $P_R$. Similarly, the optimal $\mathbf{A}$ to maximize $r_{12}$ in (7) for the OWRC from S1 to S2 via R is in the form of $\mathbf{A}_{12} = c_{12} \mathbf{h}_2^* \mathbf{h}_1^H$. It then follows that $\mathbf{A}_{21}$ differs from $\mathbf{A}_{12}$ unless $\mathbf{h}_1 = v \mathbf{h}_2$ for some constant $v$, i.e., $\mathbf{h}_1$ and $\mathbf{h}_2$ are parallel. Therefore, relay beamforming designs for the OWRC with separate unidirectional transmissions in general cannot be applied to the TWRC with simultaneous bidirectional transmissions. As observed from Theorem 3.1, the optimal relay beamforming matrix for TWRC lies in the space spanned by both $\mathbf{h}_1$ and $\mathbf{h}_2$.

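Let $\mathbf{g}_1 = \mathbf{U}^H \mathbf{h}_1 \in \mathbb{C}^{2 \times 1}$ and $\mathbf{g}_2 = \mathbf{U}^H \mathbf{h}_2 \in \mathbb{C}^{2 \times 1}$ be the "effective" channels from S1 to R and from S2 to R, respectively, by applying the optimal structure of $\mathbf{A}$ given in (12). Similarly, $\mathbf{g}_1^T$ and $\mathbf{g}_2^T$ become the effective channels from R to S1 and S2, respectively. $\mathcal{R}(p_1, p_2, P_R)$ in (8) can then be equivalently re-expressed as

$$\mathcal{R}(p_1, p_2, P_R) = \bigcup_{\mathbf{B}:\, p_R(\mathbf{B}) \leq P_R} \left\{ (r_{21}, r_{12}) : r_{21} \leq \frac{1}{2} \log_2 \left( 1 + \frac{|\mathbf{g}_1^T \mathbf{B} \mathbf{g}_2|^2 p_2}{\|\mathbf{B}^H \mathbf{g}_1^*\|^2 + 1} \right),\; r_{12} \leq \frac{1}{2} \log_2 \left( 1 + \frac{|\mathbf{g}_2^T \mathbf{B} \mathbf{g}_1|^2 p_1}{\|\mathbf{B}^H \mathbf{g}_2^*\|^2 + 1} \right) \right\} \tag{13}$$

where $p_R(\mathbf{B}) = \|\mathbf{B}\mathbf{g}_1\|^2 p_1 + \|\mathbf{B}\mathbf{g}_2\|^2 p_2 + \mathrm{tr}(\mathbf{B}\mathbf{B}^H)$. Note that the not-yet-determined parameter in (13) is $\mathbf{B}$. Since $\mathbf{B}$ has 4 complex-valued variables as compared to $M^2$ in $\mathbf{A}$, the complexity for searching the optimal $\mathbf{B}$ corresponding to a particular boundary rate-pair of $\mathcal{R}(p_1, p_2, P_R)$ is reduced when $M > 2$. Using Theorem 3.1 and (13), optimal structures of $\mathbf{A}$ can be further simplified in the following two special cases, which are Case I: $\mathbf{h}_1 \perp \mathbf{h}_2$, i.e., $\mathbf{h}_1^H \mathbf{h}_2 = 0$; and Case II: $\mathbf{h}_1 \parallel \mathbf{h}_2$, i.e., $\mathbf{h}_1 = v \mathbf{h}_2$ with $v$ being a constant.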

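Lemma 3.1: In the case of $\mathbf{h}_1 \perp \mathbf{h}_2$, the optimal structure of $\mathbf{A}$ is in the form of $\mathbf{A} = \mathbf{U}^* \begin{bmatrix} 0 & c \\ d & 0 \end{bmatrix} \mathbf{U}^H$, with $c \geq 0$ and $d \geq 0$.

Proof: Please refer to Appendix II. ■

Lemma 3.2: In the case of $\mathbf{h}_1 \parallel \mathbf{h}_2$, the optimal structure of $\mathbf{A}$ is in the form of $\mathbf{A} = \mathbf{U}^* \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix} \mathbf{U}^H$, with $a \geq 0$.

Proof: Please refer to Appendix III. ■

Note that in other cases of $\mathbf{h}_1$ and $\mathbf{h}_2$ beyond the above two, we do not have further simplified structures for $\mathbf{B}$, or $\mathbf{A}$ upon that in (12). Thus, in general we need to resort to optimization techniques to obtain the $2 \times 2$ matrix $\mathbf{B}$ for each boundary rate-pair of $\mathcal{R}(p_1, p_2, P_R)$, as will be shown next.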
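The reduction from the $M \times M$ matrix $\mathbf{A}$ to the $2 \times 2$ matrix $\mathbf{B}$ can be checked numerically. The sketch below is an illustration under stated assumptions: instead of the SVD in (11), it builds an orthonormal basis of span{h1, h2} by Gram-Schmidt (which differs from the SVD's U only by a 2 x 2 unitary rotation that can be absorbed into B), and the channel and B values are arbitrary example numbers. It verifies that the end-to-end gains computed with A = U^* B U^H equal those computed with the effective channels g1 and g2.

```python
def dot_h(u, v):
    """Inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def orthonormal_basis(h1, h2):
    """Gram-Schmidt basis (u1, u2) of span{h1, h2}; assumes h1, h2 are not parallel."""
    n1 = abs(dot_h(h1, h1)) ** 0.5
    u1 = [v / n1 for v in h1]
    c = dot_h(u1, h2)
    w = [b - c * a for a, b in zip(u1, h2)]
    n2 = abs(dot_h(w, w)) ** 0.5
    u2 = [v / n2 for v in w]
    return u1, u2

def structured_A(u1, u2, B):
    """A = U^* B U^H, where the columns of U are u1 and u2."""
    cols = (u1, u2)
    M = len(u1)
    return [[sum(cols[i][m].conjugate() * B[i][j] * cols[j][n].conjugate()
                 for i in range(2) for j in range(2))
             for n in range(M)] for m in range(M)]

def bilinear(u, A, v):
    """u^T A v (plain transpose, no conjugation)."""
    return sum(u[m] * A[m][n] * v[n] for m in range(len(u)) for n in range(len(v)))

# Arbitrary example values with M = 3 relay antennas (assumed, for illustration).
h1 = [1 + 1j, 0.5 + 0j, -0.2j]
h2 = [0.3 + 0j, 1 - 0.5j, 0.7 + 0j]
B = [[0.2 + 0j, 0.5j], [0.4 + 0j, -0.1 + 0j]]

u1, u2 = orthonormal_basis(h1, h2)
g1 = [dot_h(u1, h1), dot_h(u2, h1)]   # g1 = U^H h1
g2 = [dot_h(u1, h2), dot_h(u2, h2)]   # g2 = U^H h2
A = structured_A(u1, u2, B)

gain_21_full = bilinear(h1, A, h2)     # h1^T A h2, using the 3 x 3 matrix
gain_21_reduced = bilinear(g1, B, g2)  # g1^T B g2, using only the 2 x 2 matrix
gain_12_full = bilinear(h2, A, h1)
gain_12_reduced = bilinear(g2, B, g1)
```

The full and reduced gains agree up to floating-point error, so any search over boundary rate-pairs only needs to vary the four entries of `B`.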

B. Optimization Problems 

Since $\mathcal{R}(p_1, p_2, P_R)$ in (13) is the same as that in (8), we use (13) in this subsection to characterize all the boundary rate-pairs of $\mathcal{R}(p_1, p_2, P_R)$. A commonly used method to characterize different rate-tuples on the boundary of a multiuser capacity region is via solving a sequence of weighted sum-rate maximization (WSRMax) problems, each for a different (nonnegative) rate weight vector of users. In the case of TWRC, let $\mathbf{w} = [w_{21}, w_{12}]^T$ be the weight vector, where $w_{21}$ and $w_{12}$ are the "rate rewards" for $r_{21}$ and $r_{12}$, respectively. From (13), we can express the WSRMax problem to determine a particular boundary rate-pair of $\mathcal{R}(p_1, p_2, P_R)$ as

$$\begin{aligned} \max_{\mathbf{B}} \quad & \frac{w_{21}}{2} \log_2 \left( 1 + \frac{|\mathbf{g}_1^T \mathbf{B} \mathbf{g}_2|^2 p_2}{\|\mathbf{B}^H \mathbf{g}_1^*\|^2 + 1} \right) + \frac{w_{12}}{2} \log_2 \left( 1 + \frac{|\mathbf{g}_2^T \mathbf{B} \mathbf{g}_1|^2 p_1}{\|\mathbf{B}^H \mathbf{g}_2^*\|^2 + 1} \right) \\ \text{s.t.} \quad & \|\mathbf{B}\mathbf{g}_1\|^2 p_1 + \|\mathbf{B}\mathbf{g}_2\|^2 p_2 + \mathrm{tr}(\mathbf{B}\mathbf{B}^H) \leq P_R. \end{aligned} \tag{14}$$

In the above problem, although the constraint is convex, the objective function is not a concave function of $\mathbf{B}$. As a result, this problem is non-convex [25], and is thus difficult to solve via standard convex optimization techniques.

Therefore, we need to resort to an alternative to WSRMax to characterize $\mathcal{R}(p_1, p_2, P_R)$. In [27], an interesting concept, the so-called rate profile, was introduced to efficiently characterize boundary rate-tuples of a capacity region. A rate profile regulates the ratio between each user's rate, $r_k$, and their sum-rate, $R_{\rm sum} = \sum_{k=1}^{K} r_k$, to be a predefined value $\alpha_k$, i.e., $\frac{r_k}{R_{\rm sum}} = \alpha_k$, $k = 1, \ldots, K$, with $K$ denoting the number of users. The rate-profile vector is then defined as $\boldsymbol{\alpha} = [\alpha_1, \ldots, \alpha_K]^T$. For a given $\boldsymbol{\alpha}$, if $R_{\rm sum}$ is maximized subject to the rate-profile constraint specified by $\boldsymbol{\alpha}$, the solution rate-tuple, $R_{\rm sum} \boldsymbol{\alpha}$, can then be geometrically viewed as the intersection of a straight line specified by a slope of $\boldsymbol{\alpha}$ and passing through the origin of the capacity region, with the capacity region boundary. Thereby, with different $\boldsymbol{\alpha}$'s, all the boundary rate-tuples of the capacity region can be obtained.
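Next, we show that by applying the above method based on rate profile, boundary rate-pairs of $\mathcal{R}(p_1, p_2, P_R)$ can be efficiently characterized. Since in our case $\mathcal{R}(p_1, p_2, P_R)$ lies in a two-dimensional space, we can express the rate-profile vector as $\boldsymbol{\alpha} = [\alpha_{21}, \alpha_{12}]^T$, where $\alpha_{21} = \frac{r_{21}}{R_{\rm sum}}$, $\alpha_{12} = \frac{r_{12}}{R_{\rm sum}}$, and $R_{\rm sum} = r_{21} + r_{12}$. For a fixed $\boldsymbol{\alpha}$, we consider the following sum-rate maximization problem:

$$\begin{aligned} \max_{R_{\rm sum},\, \mathbf{B}} \quad & R_{\rm sum} \\ \text{s.t.} \quad & \frac{1}{2} \log_2 \left( 1 + \frac{|\mathbf{g}_1^T \mathbf{B} \mathbf{g}_2|^2 p_2}{\|\mathbf{B}^H \mathbf{g}_1^*\|^2 + 1} \right) \geq \alpha_{21} R_{\rm sum} \\ & \frac{1}{2} \log_2 \left( 1 + \frac{|\mathbf{g}_2^T \mathbf{B} \mathbf{g}_1|^2 p_1}{\|\mathbf{B}^H \mathbf{g}_2^*\|^2 + 1} \right) \geq \alpha_{12} R_{\rm sum} \\ & \|\mathbf{B}\mathbf{g}_1\|^2 p_1 + \|\mathbf{B}\mathbf{g}_2\|^2 p_2 + \mathrm{tr}(\mathbf{B}\mathbf{B}^H) \leq P_R. \end{aligned} \tag{15}$$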

After solving the above problem, the solution of $\mathbf{B}$ can be used to construct the optimal relay beamforming matrix $\mathbf{A}$ according to (12), and the solution $R_{\rm sum} \boldsymbol{\alpha}$ becomes the rate-pair on the boundary of $\mathcal{R}(p_1, p_2, P_R)$ corresponding to the given $\boldsymbol{\alpha}$. To solve problem (15), we first consider the following relay power minimization problem subject to rate constraints:

$$\begin{aligned} \min_{\mathbf{B}} \quad & p_R := \|\mathbf{B}\mathbf{g}_1\|^2 p_1 + \|\mathbf{B}\mathbf{g}_2\|^2 p_2 + \mathrm{tr}(\mathbf{B}\mathbf{B}^H) \\ \text{s.t.} \quad & \frac{1}{2} \log_2 \left( 1 + \frac{|\mathbf{g}_1^T \mathbf{B} \mathbf{g}_2|^2 p_2}{\|\mathbf{B}^H \mathbf{g}_1^*\|^2 + 1} \right) \geq \alpha_{21} r \\ & \frac{1}{2} \log_2 \left( 1 + \frac{|\mathbf{g}_2^T \mathbf{B} \mathbf{g}_1|^2 p_1}{\|\mathbf{B}^H \mathbf{g}_2^*\|^2 + 1} \right) \geq \alpha_{12} r. \end{aligned} \tag{16}$$

If the above problem is feasible, its optimal value, denoted by $p_R^*$, will be the minimum relay power required to support the given rate-pair $r\boldsymbol{\alpha}$; otherwise, there is no finite relay power that can support this rate-pair, and for convenience we denote $p_R^* = +\infty$ in this case. Problems (15) and (16) are related as follows. If for some given $r$, $\boldsymbol{\alpha}$ and $P_R$, the optimal value of problem (16) satisfies $p_R^* > P_R$, it follows that $r$ must be an infeasible solution of $R_{\rm sum}$ in problem (15), i.e., the rate-pair $r\boldsymbol{\alpha}$ must fall outside $\mathcal{R}(p_1, p_2, P_R)$ along the line specified by slope $\boldsymbol{\alpha}$; if $p_R^* \leq P_R$, it follows that $r$ is a feasible solution of $R_{\rm sum}$ and thus $r\boldsymbol{\alpha}$ must be within $\mathcal{R}(p_1, p_2, P_R)$. Based on the above observations, we obtain the following algorithm for problem (15), for which a rigorous proof is given in Appendix IV.
Algorithm 3.1:

• Given $R_{\rm sum} \in [0, \bar{R}_{\rm sum}]$, $\boldsymbol{\alpha}$.

• Initialize $r_{\min} = 0$, $r_{\max} = \bar{R}_{\rm sum}$.

• Repeat

1. Set $r \leftarrow \frac{1}{2}(r_{\min} + r_{\max})$.

2. Solve problem (16) to obtain its optimal value, $p_R^*$.

3. Update $r$ by the bisection method [25]: If $p_R^* \leq P_R$, set $r_{\min} \leftarrow r$; otherwise, $r_{\max} \leftarrow r$.

• Until $r_{\max} - r_{\min} < \delta_r$, where $\delta_r$ is a small positive constant to control the algorithm accuracy. The converged value of $r_{\min}$ is the optimal solution of $R_{\rm sum}$ in (15).

Note that $\bar{R}_{\rm sum}$ is an upper bound on the optimal solution of $R_{\rm sum}$ in (15) for the given $\boldsymbol{\alpha}$. In Section V (see Remark 5.1), we obtain such an upper bound that is valid for all possible values of $\boldsymbol{\alpha}$. In the next subsection, we will address the remaining part in Algorithm 3.1 on how to solve problem (16) in Step 2.
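The bisection loop of Algorithm 3.1 can be sketched as follows. The solver for problem (16) is passed in as a callable; the function name `min_relay_power` and the closed-form toy oracle used here are hypothetical placeholders (the toy oracle is simply an increasing function of the target sum-rate, not the solution of (16)).

```python
def max_sum_rate(min_relay_power, P_R, r_upper, tol=1e-6):
    """Bisection over the target sum-rate r, following Algorithm 3.1.

    min_relay_power(r) must return the minimum relay power p_R^* needed to
    support the rate-pair r * alpha (float('inf') if infeasible).
    r_upper is an upper bound on the optimal sum-rate.
    """
    r_min, r_max = 0.0, r_upper
    while r_max - r_min >= tol:
        r = 0.5 * (r_min + r_max)
        if min_relay_power(r) <= P_R:
            r_min = r      # r is supportable: move the lower end up
        else:
            r_max = r      # r is not supportable: move the upper end down
    return r_min

def toy_min_relay_power(r):
    """Hypothetical stand-in for the solution of (16): increasing in r."""
    return 2.0 ** (2.0 * r) - 1.0

r_star = max_sum_rate(toy_min_relay_power, P_R=3.0, r_upper=4.0)
```

The loop relies only on the feasibility test in Step 3, so any routine that returns the optimal value of (16) can be substituted for the toy oracle. The number of iterations grows logarithmically with `r_upper / tol`.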

C. Power Minimization under SNR Constraints 

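Denote $\gamma_1$ and $\gamma_2$ as the SNRs at the receivers of S1 and S2, respectively, which are defined as

$$\gamma_1 = \frac{|\mathbf{g}_1^T \mathbf{B} \mathbf{g}_2|^2 p_2}{\|\mathbf{B}^H \mathbf{g}_1^*\|^2 + 1}, \quad \gamma_2 = \frac{|\mathbf{g}_2^T \mathbf{B} \mathbf{g}_1|^2 p_1}{\|\mathbf{B}^H \mathbf{g}_2^*\|^2 + 1}. \tag{17}$$

Let $\bar{\gamma}_1 = 2^{2\alpha_{21} r} - 1$ and $\bar{\gamma}_2 = 2^{2\alpha_{12} r} - 1$ be the equivalent SNR targets at S1 and S2 to guarantee the given rate constraints. Then, it is observed that the rate constraints in (16) can be expressed as the corresponding SNR constraints at S1 and S2, $\gamma_1 \geq \bar{\gamma}_1$ and $\gamma_2 \geq \bar{\gamma}_2$, respectively. Using (17), problem (16) can be recast as the following equivalent problem:

$$\begin{aligned} \min_{\mathbf{B}} \quad & p_R := \|\mathbf{B}\mathbf{g}_1\|^2 p_1 + \|\mathbf{B}\mathbf{g}_2\|^2 p_2 + \mathrm{tr}(\mathbf{B}\mathbf{B}^H) \\ \text{s.t.} \quad & |\mathbf{g}_1^T \mathbf{B} \mathbf{g}_2|^2 \geq \frac{\bar{\gamma}_1}{p_2} \|\mathbf{B}^H \mathbf{g}_1^*\|^2 + \frac{\bar{\gamma}_1}{p_2} \\ & |\mathbf{g}_2^T \mathbf{B} \mathbf{g}_1|^2 \geq \frac{\bar{\gamma}_2}{p_1} \|\mathbf{B}^H \mathbf{g}_2^*\|^2 + \frac{\bar{\gamma}_2}{p_1}. \end{aligned} \tag{18}$$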

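Note that the above problem may be of practical interest itself, since it is relevant when certain prescribed transmission quality-of-service (QoS) requirements in terms of receiver SNRs need to be fulfilled at S1 and S2. For the convenience of analysis, we modify the above problem as follows. First, let $\mathrm{Vec}(\mathbf{Q})$ be a $K^2 \times 1$ vector associated with a $K \times K$ square matrix $\mathbf{Q} = [\mathbf{q}_1, \ldots, \mathbf{q}_K]^T$, where $\mathbf{q}_k \in \mathbb{C}^{K \times 1}$, $k = 1, \ldots, K$, by the rule of $\mathrm{Vec}(\mathbf{Q}) = [\mathbf{q}_1^T, \ldots, \mathbf{q}_K^T]^T$. Next, with $\mathbf{b} = \mathrm{Vec}(\mathbf{B})$ and $\boldsymbol{\Theta} = p_1 \mathbf{g}_1 \mathbf{g}_1^H + p_2 \mathbf{g}_2 \mathbf{g}_2^H + \mathbf{I}$, we can express $p_R$ in the objective function of (18) as $p_R = \mathrm{tr}(\mathbf{B} \boldsymbol{\Theta} \mathbf{B}^H) = \|\boldsymbol{\Phi} \mathbf{b}\|^2$, where $\boldsymbol{\Phi} = (\mathrm{diag}(\boldsymbol{\Theta}^T, \boldsymbol{\Theta}^T))^{1/2}$. Similarly, let $\mathbf{f}_1 = \mathrm{Vec}(\mathbf{g}_1 \mathbf{g}_2^T)$ and $\mathbf{f}_2 = \mathrm{Vec}(\mathbf{g}_2 \mathbf{g}_1^T)$. Then, from (18) it follows that $|\mathbf{g}_1^T \mathbf{B} \mathbf{g}_2|^2 = |\mathbf{f}_1^T \mathbf{b}|^2$ and $|\mathbf{g}_2^T \mathbf{B} \mathbf{g}_1|^2 = |\mathbf{f}_2^T \mathbf{b}|^2$. Furthermore, by defining

$$\mathbf{G}_i = \begin{bmatrix} \mathbf{g}_i(1,1) & 0 & \mathbf{g}_i(2,1) & 0 \\ 0 & \mathbf{g}_i(1,1) & 0 & \mathbf{g}_i(2,1) \end{bmatrix}, \quad i = 1, 2,$$

we have $\|\mathbf{B}^H \mathbf{g}_i^*\|^2 = \|\mathbf{G}_i \mathbf{b}\|^2$, $i = 1, 2$. Using the above transformations, (18) can be rewritten as

$$\begin{aligned} \min_{\mathbf{b}} \quad & p_R := \|\boldsymbol{\Phi} \mathbf{b}\|^2 \\ \text{s.t.} \quad & |\mathbf{f}_1^T \mathbf{b}|^2 \geq \frac{\bar{\gamma}_1}{p_2} \|\mathbf{G}_1 \mathbf{b}\|^2 + \frac{\bar{\gamma}_1}{p_2} \\ & |\mathbf{f}_2^T \mathbf{b}|^2 \geq \frac{\bar{\gamma}_2}{p_1} \|\mathbf{G}_2 \mathbf{b}\|^2 + \frac{\bar{\gamma}_2}{p_1}. \end{aligned} \tag{19}$$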

The above problem can be shown to be still non-convex. However, in the following, we show that the exact optimal solution can be obtained via a relaxed semidefinite programming (SDP) [25] problem.
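The vectorized identities used to pass from (18) to (19) can be checked numerically. The sketch below assumes the row-wise stacking rule for Vec stated above and a 2 x 4 layout for G_i with zeros in the positions shown in the code; the numeric values of B, g1, and g2 are arbitrary examples.

```python
def vec(Q):
    """Row-wise stacking: Vec(Q) = [q_1^T, ..., q_K^T]^T, with q_k^T the k-th row of Q."""
    return [x for row in Q for x in row]

def outer(u, v):
    """u v^T (no conjugation)."""
    return [[a * b for b in v] for a in u]

def G_matrix(g):
    """2 x 4 matrix G such that G b = B^T g for b = Vec(B)."""
    return [[g[0], 0, g[1], 0],
            [0, g[0], 0, g[1]]]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def norm_sq(x):
    return sum(abs(v) ** 2 for v in x)

# Arbitrary example values (assumed, for illustration only).
B = [[0.2 + 0.1j, 0.5j], [0.4 + 0j, -0.1 + 0.3j]]
g1 = [1.0 + 0.5j, 0.3 - 0.2j]
g2 = [0.7 - 0.1j, -0.4 + 0.6j]
b = vec(B)

# Signal term: |g1^T B g2|^2 versus |f1^T b|^2 with f1 = Vec(g1 g2^T).
lhs_signal = abs(sum(g1[i] * B[i][j] * g2[j] for i in range(2) for j in range(2))) ** 2
f1 = vec(outer(g1, g2))
rhs_signal = abs(sum(f * x for f, x in zip(f1, b))) ** 2

# Noise term: ||B^H g1^*||^2 versus ||G_1 b||^2.
BH_g1_conj = [sum(B[i][j].conjugate() * g1[i].conjugate() for i in range(2))
              for j in range(2)]
lhs_noise = norm_sq(BH_g1_conj)
rhs_noise = norm_sq(matvec(G_matrix(g1), b))
```

Both pairs agree up to rounding, which confirms that the constraints in (19) are the constraints in (18) written in terms of the single vector `b`.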



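We first define $\mathbf{E}_0 = \boldsymbol{\Phi}^H \boldsymbol{\Phi}$, $\mathbf{E}_1 = \frac{p_2}{\bar{\gamma}_1} \mathbf{f}_1^* \mathbf{f}_1^T - \mathbf{G}_1^H \mathbf{G}_1$, and $\mathbf{E}_2 = \frac{p_1}{\bar{\gamma}_2} \mathbf{f}_2^* \mathbf{f}_2^T - \mathbf{G}_2^H \mathbf{G}_2$. Since standard SDP formulations only involve real variables and constants, we introduce a new real matrix variable as $\mathbf{X} = [\mathbf{b}_R; \mathbf{b}_I] \times [\mathbf{b}_R; \mathbf{b}_I]^T$, where $\mathbf{b}_R = \mathrm{Re}(\mathbf{b})$ and $\mathbf{b}_I = \mathrm{Im}(\mathbf{b})$ are the real and imaginary parts of $\mathbf{b}$, respectively. To rewrite the norm representations in (19) in terms of $\mathbf{X}$, we need to rewrite $\mathbf{E}_0$, $\mathbf{E}_1$, and $\mathbf{E}_2$ as expanded matrices $\mathbf{F}_0$, $\mathbf{F}_1$, and $\mathbf{F}_2$, respectively, in terms of their real and imaginary parts. Specifically, to write out $\mathbf{F}_0$, we first define the short notations $\boldsymbol{\Phi}_R = \mathrm{Re}(\boldsymbol{\Phi})$ and $\boldsymbol{\Phi}_I = \mathrm{Im}(\boldsymbol{\Phi})$; then we have

$$\mathbf{F}_0 = \begin{bmatrix} \boldsymbol{\Phi}_R^T \boldsymbol{\Phi}_R + \boldsymbol{\Phi}_I^T \boldsymbol{\Phi}_I & \boldsymbol{\Phi}_I^T \boldsymbol{\Phi}_R - \boldsymbol{\Phi}_R^T \boldsymbol{\Phi}_I \\ \boldsymbol{\Phi}_R^T \boldsymbol{\Phi}_I - \boldsymbol{\Phi}_I^T \boldsymbol{\Phi}_R & \boldsymbol{\Phi}_R^T \boldsymbol{\Phi}_R + \boldsymbol{\Phi}_I^T \boldsymbol{\Phi}_I \end{bmatrix}.$$

The expanded matrices $\mathbf{F}_1$ and $\mathbf{F}_2$ can be generated from $\mathbf{E}_1$ and $\mathbf{E}_2$ in a similar way, where the two terms in $\mathbf{E}_1$ or $\mathbf{E}_2$ could first be expanded separately and then summed together. As such, problem (19) can be equivalently rewritten as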

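$$\begin{aligned} \min_{\mathbf{X}} \quad & p_R := \mathrm{tr}(\mathbf{F}_0 \mathbf{X}) \\ \text{s.t.} \quad & \mathrm{tr}(\mathbf{F}_1 \mathbf{X}) \geq 1,\; \mathrm{tr}(\mathbf{F}_2 \mathbf{X}) \geq 1,\; \mathbf{X} \succeq 0, \\ & \mathrm{rank}(\mathbf{X}) = 1. \end{aligned} \tag{20}$$

The above problem is still not convex given the last rank-one constraint. However, if we remove such a constraint, this problem is relaxed into a convex SDP problem as shown below.

$$\begin{aligned} \min_{\mathbf{X}} \quad & p_R := \mathrm{tr}(\mathbf{F}_0 \mathbf{X}) \\ \text{s.t.} \quad & \mathrm{tr}(\mathbf{F}_1 \mathbf{X}) \geq 1,\; \mathrm{tr}(\mathbf{F}_2 \mathbf{X}) \geq 1,\; \mathbf{X} \succeq 0. \end{aligned} \tag{21}$$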

Given the convexity of the above SDP problem, the optimal solution can be efficiently found by various convex optimization methods [25]. Note that if problem (21) is infeasible, so is the more restricted problem (20). Thus, we assume w.l.o.g. that problem (21) is feasible in the following discussions. SDP relaxation usually leads to an optimal $\mathbf{X}$ for problem (21) that is of rank $r$ with $r \geq 1$, which makes it impossible to reconstruct the exact optimal solution for problem (19) when $r > 1$. A commonly adopted method in the literature to obtain a feasible rank-one (but in general suboptimal) solution from the solution of SDP relaxation is via "randomization" (see, e.g., [28] and references therein). Fortunately, we show in the following that with the special structure in problem (21), we can efficiently reconstruct an optimal rank-one solution from its optimal solution that could be of rank $r$ with $r > 1$, based on some elegant results derived for SDP relaxation in [29]. In other words, we can obtain the exact optimal solution for the non-convex problem in (20) without losing any optimality, and as efficiently as solving a convex problem.
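The real-valued expansion that turns a Hermitian quadratic form in b into a trace expression in the real matrix X can be illustrated as follows. The sketch assumes the standard block construction F = [[E_R, -E_I], [E_I, E_R]] for a Hermitian matrix E = E_R + jE_I, which is consistent with the expression for F_0 when E_0 = Phi^H Phi. The matrix E and the vector b below are arbitrary example values, and a small dimension is used for brevity.

```python
def real_lift(E):
    """For Hermitian E = E_R + j E_I, return the real matrix [[E_R, -E_I], [E_I, E_R]]."""
    n = len(E)
    top = [[E[i][k].real for k in range(n)] + [-E[i][k].imag for k in range(n)]
           for i in range(n)]
    bottom = [[E[i][k].imag for k in range(n)] + [E[i][k].real for k in range(n)]
              for i in range(n)]
    return top + bottom

def quad_form_complex(E, b):
    """b^H E b for a complex vector b."""
    n = len(b)
    return sum(b[i].conjugate() * E[i][k] * b[k] for i in range(n) for k in range(n))

def trace_FX(F, v):
    """tr(F X) for the rank-one matrix X = v v^T."""
    n = len(v)
    return sum(F[i][k] * v[k] * v[i] for i in range(n) for k in range(n))

# Arbitrary Hermitian example (assumed values).
E = [[2.0 + 0j, 0.5 - 1.0j],
     [0.5 + 1.0j, 1.5 + 0j]]
b = [0.3 + 0.4j, -0.2 + 0.1j]

# Stack real and imaginary parts: v = [b_R; b_I], so that X = v v^T.
v = [x.real for x in b] + [x.imag for x in b]

F = real_lift(E)
complex_value = quad_form_complex(E, b)   # real-valued because E is Hermitian
real_value = trace_FX(F, v)
```

The two values coincide, so each quadratic expression in (19) becomes a linear function of `X`; only the rank-one requirement on `X` remains non-convex, and it is this requirement that the relaxation drops.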

Theorem 3.2: Assume that an optimal solution $\mathbf{X}^*$ of rank $r > 1$ has been found for problem (21). Then we can efficiently construct another feasible optimal solution $\mathbf{X}^{**}$ of rank one, i.e., $\mathbf{X}^{**}$ is the optimal solution for both (20) and (21).

Proof: Please refer to Appendix V. ■

Since the above proof is constructive, we can write a routine to obtain an optimal rank-one solution for problem (20) from $\mathbf{X}^*$, as given in the last part of Appendix V.

IV. Low-Complexity Relay Beamforming Schemes 

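In this section, we present suboptimal relay beamforming schemes that require lower complexity for implementation than the optimal scheme developed in Section III. Two suboptimal beamforming structures for $\mathbf{A}$ are proposed as follows:

• Maximal-Ratio Reception and Maximal-Ratio Transmission (MRR-MRT):

$$\mathbf{A}_{\rm MR} = \mathbf{H}_{\rm DL}^H \begin{bmatrix} a_{\rm MR} & 0 \\ 0 & b_{\rm MR} \end{bmatrix} \mathbf{H}_{\rm UL}^H. \tag{22}$$

• Zero-Forcing Reception and Zero-Forcing Transmission (ZFR-ZFT):^3

$$\mathbf{A}_{\rm ZF} = \mathbf{H}_{\rm DL}^{\dagger} \begin{bmatrix} a_{\rm ZF} & 0 \\ 0 & b_{\rm ZF} \end{bmatrix} \mathbf{H}_{\rm UL}^{\dagger}. \tag{23}$$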



Note that from ( fTOl i. it follows that and b^, x = MR 
or ZF, ax > and > 0, in the above beamforming 
structures play the role of balancing relay power allocations 
to transmissions from SI to S2 and from S2 to SI. MRR- 
MRT applies the "matched-filter (MF)" -based receive and 
transmit beamforming at R to maximize the total signal power 
forwarded to SI and S2. However, in this scheme, R does not 
attempt to suppress or mitigate the interference between SI 
and S2. On the other hand, ZFR-ZFT applies the "zero-forcing 
(ZF)" -based receive and transmit beamforming to remove the 
interferences between SI and S2 at R as well as at the end 
receivers of SI and S2. To_illustrate_ this, we substitute Azf 

2/2 (") 
2/1 (") 



in ( [23] ) into JTOl ) to obtain 



in the form of 



azF^/PiSi{n) 1 r azF 1 t ^ f„\ , \ ^^in) 
bzFVP^S2{n) J + [ bzF \ ^UL^i^W + [ 

It is observed from the above that the self-interferences are completely removed at the receivers of S1 and S2 by ZF-based relay beamforming. Therefore, the main advantage of ZFR-ZFT over MRR-MRT is that S1 and S2 need not implement self-interference cancellation, which simplifies their receivers. In general, with ANC, we know that the interference between S1 and S2 observed at R is in fact the self-interference of S1 or S2, and can be later removed at the end receiver of S1 or S2. Thus, it is conjectured that MRR-MRT may have a superior performance over ZFR-ZFT for ANC-based TWRC. This conjecture is in fact true, and will be verified in later parts of this paper via performance analysis and simulation results.
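As a rough numerical illustration of the two structures in (22) and (23), the following Python sketch builds both matrices for randomly drawn channels and inspects the effective end-to-end 2×2 channel from $[s_1, s_2]$ to $[y_2, y_1]$. It assumes $H_{\mathrm{UL}} = [h_1, h_2]$ and $H_{\mathrm{DL}} = [h_2, h_1]^T$, which is an assumption about definitions made earlier in the paper; the function name `relay_beamformers` and the values of `M`, `a`, and `b` are hypothetical choices for illustration only.

```python
import numpy as np

def relay_beamformers(h1, h2, a=1.0, b=1.0):
    """Return (A_MR, A_ZF), assuming H_UL = [h1, h2] and H_DL = [h2, h1]^T."""
    H_ul = np.column_stack([h1, h2])   # M x 2 uplink channel (assumed layout)
    H_dl = np.vstack([h2, h1])         # 2 x M downlink channel (assumed layout)
    D = np.diag([a, b])                # power-balancing weights
    A_mr = H_dl.conj().T @ D @ H_ul.conj().T           # matched-filter structure
    A_zf = np.linalg.pinv(H_dl) @ D @ np.linalg.pinv(H_ul)  # zero-forcing structure
    return A_mr, A_zf

rng = np.random.default_rng(0)
M = 4                                   # hypothetical number of relay antennas
h1 = rng.standard_normal(M) + 1j * rng.standard_normal(M)
h2 = rng.standard_normal(M) + 1j * rng.standard_normal(M)
A_mr, A_zf = relay_beamformers(h1, h2, a=0.7, b=1.3)

H_ul = np.column_stack([h1, h2])
H_dl = np.vstack([h2, h1])
# Effective channel: rows are (y2, y1), columns are (s1, s2).
# Off-diagonal entries carry each source's own signal back to it.
G_zf = H_dl @ A_zf @ H_ul
G_mr = H_dl @ A_mr @ H_ul
print(np.round(G_zf, 6))
print(np.round(np.abs(G_mr), 3))
```

With ZFR-ZFT the effective channel is diagonal, so no self-interference reaches the end receivers; with MRR-MRT the off-diagonal terms are generally nonzero and must be cancelled at S1 and S2.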

Interestingly, the above two suboptimal beamforming schemes both comply with the optimal beamforming structure given in (12), while their associated values of $B$ are in general suboptimal. This can be easily verified by rewriting $A_{\mathrm{MR}}$ in (22) and $A_{\mathrm{ZF}}$ in (23) as $A_{\mathrm{MR}} = U^{*} B_{\mathrm{MR}} U^{H}$ and $A_{\mathrm{ZF}} = U^{*} B_{\mathrm{ZF}} U^{H}$, respectively, where



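$$B_{\mathrm{MR}} = \Sigma V^{T} \begin{bmatrix} 0 & b_{\mathrm{MR}} \\ a_{\mathrm{MR}} & 0 \end{bmatrix} V \Sigma, \tag{24}$$

$$B_{\mathrm{ZF}} = \Sigma^{-1} V^{T} \begin{bmatrix} 0 & b_{\mathrm{ZF}} \\ a_{\mathrm{ZF}} & 0 \end{bmatrix} V \Sigma^{-1}. \tag{25}$$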



Using Lemmas 3.1 and 3.2, we can show the following results on the optimality of MRR-MRT and ZFR-ZFT in some special channel cases. For brevity, we omit the proofs here.

Note that the ZFR-ZFT scheme with $a_{\mathrm{ZF}} = b_{\mathrm{ZF}}$ has also been proposed in [18], but without detailed performance analysis.

Lemma 4.1: In both cases of $h_1 \perp h_2$ and $h_1 \parallel h_2$, $A_{\mathrm{MR}}$ in (22) is equivalent to the optimal $A$ given in (12).

Lemma 4.2: In the case of $h_1 \perp h_2$, $A_{\mathrm{ZF}}$ in (23) is equivalent to the optimal $A$ given in (12).

It is also noted that in the case of $h_1 \parallel h_2$, $B_{\mathrm{ZF}}$ in (25) does not exist since in $\Sigma$, $\sigma_2 = 0$, and thus $\Sigma$ is non-invertible. As a result, $A_{\mathrm{ZF}}$ does not exist either in this case.

V. Performance Analysis 

To further investigate the performances of the proposed 
optimal and suboptimal relay beamforming schemes, we study 
in this section their achievable sum-rates in TWRC. First, we 
derive an upper bound on the maximum sum-rate or the sum- 
capacity achievable by the optimal beamforming scheme, as 
well as various lower bounds on the achievable sum-rates by 
the suboptimal schemes. Then, by comparing these rate bounds 
at asymptotically high SNR, we characterize the limiting sum-rate losses incurred by the suboptimal beamforming schemes as compared to the sum-capacity.

A. Rate Bounds 

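First, we study the sum-capacity of TWRC with given $P_R$, $p_1$, and $p_2$. The sum-capacity of TWRC can be obtained by solving the WSRMax problem (14) with $w_{12} = w_{21} = 1$. Since WSRMax for TWRC is non-convex and is thus difficult to solve, we consider an upper bound on the sum-capacity, which can be obtained by solving the following modified problem of (14):

$$\begin{aligned} \max_{B_{21},\,B_{12}} \quad & \frac{1}{2}\log_2\!\left(1 + \frac{|g_1^T B_{21} g_2|^2 p_2}{\|B_{21}^H g_1^*\|^2 + 1}\right) + \frac{1}{2}\log_2\!\left(1 + \frac{|g_2^T B_{12} g_1|^2 p_1}{\|B_{12}^H g_2^*\|^2 + 1}\right) \\ \text{s.t.} \quad & \|B_{12} g_1\|^2 p_1 + \|B_{21} g_2\|^2 p_2 + \kappa_{12}\,\mathrm{tr}(B_{12} B_{12}^H) + \kappa_{21}\,\mathrm{tr}(B_{21} B_{21}^H) \le P_R \end{aligned} \tag{26}$$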



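where $\kappa_{12}$ and $\kappa_{21}$ are nonnegative and satisfy $\kappa_{12} + \kappa_{21} = 1$. Let $C_{\mathrm{sum}}(\kappa_{12}, \kappa_{21})$ denote the maximum value of the above problem. Note that if we add the constraint $B_{12} = B_{21} = B$ to the above problem, the solution for $B$ leads to the exact sum-capacity of TWRC. Since $C_{\mathrm{sum}}(\kappa_{12}, \kappa_{21})$ is an upper bound on the sum-capacity for any feasible $\kappa_{12}$ and $\kappa_{21}$, it can be tightened by minimizing $C_{\mathrm{sum}}(\kappa_{12}, \kappa_{21})$ over all the feasible pairs of $\kappa_{12}$ and $\kappa_{21}$. For given $\kappa_{12}$ and $\kappa_{21}$, problem (26) can be decomposed into the following two independent subproblems:

$$\begin{aligned} \max_{B_{21}} \quad & \frac{1}{2}\log_2\!\left(1 + \frac{|g_1^T B_{21} g_2|^2 p_2}{\|B_{21}^H g_1^*\|^2 + 1}\right) \\ \text{s.t.} \quad & \|B_{21} g_2\|^2 p_2 + \kappa_{21}\,\mathrm{tr}(B_{21} B_{21}^H) \le P_{21} \end{aligned} \tag{27}$$

$$\begin{aligned} \max_{B_{12}} \quad & \frac{1}{2}\log_2\!\left(1 + \frac{|g_2^T B_{12} g_1|^2 p_1}{\|B_{12}^H g_2^*\|^2 + 1}\right) \\ \text{s.t.} \quad & \|B_{12} g_1\|^2 p_1 + \kappa_{12}\,\mathrm{tr}(B_{12} B_{12}^H) \le P_{12} \end{aligned} \tag{28}$$

subject to an additional common constraint $P_{21} + P_{12} \le P_R$.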
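Let $C_{21}(\kappa_{21}, P_{21})$ and $C_{12}(\kappa_{12}, P_{12})$ denote the maximum values of the above two subproblems, respectively. Thus, $C_{\mathrm{sum}}(\kappa_{12}, \kappa_{21})$ can be obtained by first solving the above two subproblems for given $P_{12}$ and $P_{21}$, and then maximizing $C_{21}(\kappa_{21}, P_{21}) + C_{12}(\kappa_{12}, P_{12})$ over all the feasible values of $P_{12}$ and $P_{21}$. Note that each of the above two subproblems optimizes the relay beamforming matrix to maximize the capacity of the corresponding OWRC from S2 to S1 via R, or from S1 to S2 via R. By applying the results in prior work [20], [21], the optimal solutions to (27) and (28) can be obtained as

$$B_{21} = \sqrt{\frac{P_{21}}{p_2\|g_2\|^2 + \kappa_{21}}}\; \tilde{g}_1^{*}\, \tilde{g}_2^{H}, \tag{29}$$

$$B_{12} = \sqrt{\frac{P_{12}}{p_1\|g_1\|^2 + \kappa_{12}}}\; \tilde{g}_2^{*}\, \tilde{g}_1^{H}, \tag{30}$$

where $\tilde{g}_i = g_i / \|g_i\|$, $i = 1, 2$. By substituting the above expressions into the objective functions of (27) and (28), respectively, we obtain

$$C_{21}(\kappa_{21}, P_{21}) = \frac{1}{2}\log_2\!\left(1 + \frac{\theta_2 p_2}{1 + \dfrac{\theta_2}{\theta_1}\dfrac{p_2}{P_{21}} + \dfrac{\kappa_{21}}{\theta_1 P_{21}}}\right), \tag{31}$$

$$C_{12}(\kappa_{12}, P_{12}) = \frac{1}{2}\log_2\!\left(1 + \frac{\theta_1 p_1}{1 + \dfrac{\theta_1}{\theta_2}\dfrac{p_1}{P_{12}} + \dfrac{\kappa_{12}}{\theta_2 P_{12}}}\right), \tag{32}$$

where, for conciseness, we have denoted $\|g_1\|^2 = \|h_1\|^2 = \theta_1$ and $\|g_2\|^2 = \|h_2\|^2 = \theta_2$. It then follows that the tightest upper bound on the sum-capacity, denoted as $C_{\mathrm{UB}}$, can be obtained as

$$C_{\mathrm{UB}} = \min_{\kappa_{21} + \kappa_{12} = 1}\ \max_{P_{21} + P_{12} \le P_R}\ C_{21}(\kappa_{21}, P_{21}) + C_{12}(\kappa_{12}, P_{12}). \tag{33}$$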

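Unfortunately, there is in general no closed-form solution for $C_{\mathrm{UB}}$, and thus a numerical search over all the feasible values of $\kappa_{21}$, $\kappa_{12}$, $P_{21}$, and $P_{12}$ is needed to obtain $C_{\mathrm{UB}}$. Since $C_{21}(\kappa_{21}, P_{21})$ and $C_{12}(\kappa_{12}, P_{12})$ are increasing functions of $P_{21}$ and $P_{12}$, respectively, a simple upper bound on the sum-capacity (less tight than $C_{\mathrm{UB}}$) can be obtained from (31) and (32) with $\kappa_{21} = \kappa_{12} = 1/2$ and $P_{21} = P_{12} = P_R$ as follows:

$$C_{\mathrm{UB}}^{(0)} = \frac{1}{2}\log_2\!\left(1 + \frac{\theta_2 p_2}{1 + \dfrac{(\theta_2/\theta_1)\,p_2}{P_R} + \dfrac{1}{2\theta_1 P_R}}\right) + \frac{1}{2}\log_2\!\left(1 + \frac{\theta_1 p_1}{1 + \dfrac{(\theta_1/\theta_2)\,p_1}{P_R} + \dfrac{1}{2\theta_2 P_R}}\right). \tag{34}$$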



Remark 5.1: Note that $C_{\mathrm{UB}}^{(0)}$ given in (34) can be used as $\bar{R}_{\mathrm{sum}}$ for Algorithm 3.1 in Section III. Since $C_{\mathrm{UB}}^{(0)}$ is obtained without any constraint on rate allocations among $r_{21}$ and $r_{12}$, it is a valid upper bound on the achievable sum-rate regardless of the rate-profile vector $\alpha$.
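The min-max search in (33) and the simpler bound in (34) can be sketched numerically as follows. This is only an illustrative grid search, not the authors' procedure; it assumes the closed forms $C_{21} = \frac{1}{2}\log_2\bigl(1 + \theta_2 p_2 / (1 + \theta_2 p_2/(\theta_1 P_{21}) + \kappa_{21}/(\theta_1 P_{21}))\bigr)$ and the symmetric expression for $C_{12}$. The function names, grid size, and parameter values are hypothetical.

```python
import math

def c21(kappa21, P21, p2, theta1, theta2):
    """One-way S2 -> S1 rate for relay power share P21 (assumed closed form)."""
    snr = theta2 * p2 / (1 + (theta2 / theta1) * p2 / P21 + kappa21 / (theta1 * P21))
    return 0.5 * math.log2(1 + snr)

def c12(kappa12, P12, p1, theta1, theta2):
    """One-way S1 -> S2 rate for relay power share P12 (assumed closed form)."""
    snr = theta1 * p1 / (1 + (theta1 / theta2) * p1 / P12 + kappa12 / (theta2 * P12))
    return 0.5 * math.log2(1 + snr)

def c_ub(p1, p2, PR, theta1=1.0, theta2=1.0, grid=100):
    """Grid approximation of min over kappa, max over power split, as in (33)."""
    best = math.inf
    for i in range(grid + 1):
        k21 = i / grid
        # Both rates increase with power, so only splits with P21 + P12 = PR matter.
        inner = max(
            c21(k21, PR * j / grid, p2, theta1, theta2)
            + c12(1 - k21, PR * (grid - j) / grid, p1, theta1, theta2)
            for j in range(1, grid)
        )
        best = min(best, inner)
    return best

def c_ub0(p1, p2, PR, theta1=1.0, theta2=1.0):
    """Looser bound: kappa = 1/2 and full relay power in both directions, as in (34)."""
    return c21(0.5, PR, p2, theta1, theta2) + c12(0.5, PR, p1, theta1, theta2)

tight = c_ub(10.0, 10.0, 10.0)
loose = c_ub0(10.0, 10.0, 10.0)
print(tight, loose)
```

Because the grid includes $\kappa_{21} = 1/2$ and the rates increase with relay power, the grid value never exceeds the looser bound.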

Next, we derive the lower bounds on the sum-rates achievable by the proposed suboptimal relay beamforming schemes, MRR-MRT and ZFR-ZFT, denoted as $R_{\mathrm{LB}}^{\mathrm{MR}}$ and $R_{\mathrm{LB}}^{\mathrm{ZF}}$, respectively. Since the rate lower bound is of interest, we assume here $a_x = b_x$, where $x = \mathrm{MR}$ in (22) or $\mathrm{ZF}$ in (23). For conciseness, define $\rho = \frac{|h_1^H h_2|^2}{\|h_1\|^2 \|h_2\|^2}$ as the correlation coefficient between $h_1$ and $h_2$. Then, the following lemmas are obtained.

Lemma 5.1: With the MRR-MRT relay beamforming scheme, the achievable sum-rate of TWRC is lower-bounded by $R_{\mathrm{LB}}^{\mathrm{MR}}$ given in (35).

Proof: Please refer to Appendix VI. ■

Lemma 5.2: With the ZFR-ZFT relay beamforming scheme, the achievable sum-rate of TWRC is lower-bounded by $R_{\mathrm{LB}}^{\mathrm{ZF}}$ given in (36).

Proof: Please refer to Appendix VII. ■

B. Asymptotic Results 

Since the main advantage of TWRC over OWRC is to 
recover the loss of spectral efficiency due to half-duplex 
transmissions (see, e.g., [10]-[13]), it is important to examine 
the achievable sum-rate in TWRC at asymptotically high SNR. 
In the following theorem, asymptotic results on the upper and lower rate bounds in (34), (35), and (36) are presented.

Theorem 5.1: Let $p_1$, $p_2$, and $P_R$ all go to infinity with fixed $P_R/p_1 = K_1$ and $P_R/p_2 = K_2$. Then, $C_{\mathrm{UB}}^{(0)}$, $R_{\mathrm{LB}}^{\mathrm{MR}}$, and $R_{\mathrm{LB}}^{\mathrm{ZF}}$ converge to the values given in (37), (38), and (39), respectively.

It is observed from Theorem 5.1 that at high SNR both MRR-MRT and ZFR-ZFT (provided that $\rho < 1$) asymptotically achieve the same sum-rate pre-log factor (sum-rate normalized by $\log_2 P_R$ as $P_R \to \infty$) as that of the sum-capacity upper bound, $C_{\mathrm{UB}}^{(0)}$. However, they may have different rate gaps from $C_{\mathrm{UB}}^{(0)}$, which are constants independent of $P_R$. In order to gain more insight into the limiting sum-rate losses of the suboptimal beamforming schemes, in the following corollary, we compare the difference between the sum-capacity upper bound and $R_{\mathrm{LB}}^{\mathrm{MR}}$ or $R_{\mathrm{LB}}^{\mathrm{ZF}}$ at asymptotically high SNR in a "symmetric" TWRC with equal channel gains, i.e., $\|h_1\|^2 = \|h_2\|^2 = \theta$, and equal source and relay transmit powers, i.e., $p_1 = p_2 = P_R$. In this case, we can obtain a tighter upper bound on the sum-capacity than $C_{\mathrm{UB}}^{(0)}$ as follows. For the symmetric TWRC, with $\kappa_{12} = \kappa_{21} = 1/2$, it can be easily verified that the maximization over $P_{12}$ and $P_{21}$ in (33) is achieved when $P_{12} = P_{21} = P_R/2$, and as a result a tighter upper bound than $C_{\mathrm{UB}}^{(0)}$ for the symmetric TWRC is obtained as

$$C_{\mathrm{UB}}^{(s)} = \log_2\!\left(1 + \frac{\theta P_R}{3 + \dfrac{1}{\theta P_R}}\right). \tag{40}$$

Corollary 5.1: At asymptotically high SNR, under the assumptions that $\theta_1 = \theta_2$ and $K_1 = K_2 = 1$, we have

$$C_{\mathrm{UB}}^{(s)} - R_{\mathrm{LB}}^{\mathrm{MR}} = \log_2\!\left(\frac{1 + 3\rho}{(1 + \rho)^2}\right), \tag{41}$$

$$C_{\mathrm{UB}}^{(s)} - R_{\mathrm{LB}}^{\mathrm{ZF}} = \log_2\!\left(\frac{1}{1 - \rho}\right). \tag{42}$$

It is noted that for $0 \le \rho \le 1$, $\frac{1+3\rho}{(1+\rho)^2}$ has the minimum value equal to 1 at $\rho = 0$ or 1, and the maximum value equal to 9/8 at $\rho = 1/3$. Therefore, from (41), it follows that the sum-rate loss of MRR-MRT from the sum-capacity is at most $\log_2(9/8) \approx 0.1699$ bits/complex dimension at asymptotically high SNR. On the other hand, it is observed from (42) that the sum-rate loss incurred by ZFR-ZFT increases with $\rho$, or when


Fig. 2. Capacity region of the ANC-based TWRC with $M = 4$, $P_1 = P_2 = P_R = 10$, and $\rho = 0.5$ (both axes in bits/complex dimension). Note that the two rate regions enclosed by the dashed lines are example achievable rate regions $\mathcal{R}(p_1, p_2, P_R)$'s defined in (8), each with some fixed $p_1$ and $p_2$, $p_1 \le P_1$ and $p_2 \le P_2$. The achievable rate region denoted by A corresponds to $p_1 = P_1$ and $p_2 < P_2$, while that denoted by B corresponds to $p_1 = P_1$ and $p_2 = P_2$.

$h_1$ and $h_2$ become more correlated. This is intuitively correct, since with increasing channel correlation, more SNR loss is incurred in separating the signals from/to S1 and S2 at R by ZF-based receive/transmit beamforming. Also note that for MRR-MRT, at $\rho = 0$ or 1, the sum-rate loss is zero, which is consistent with Lemma 4.1, while the sum-rate loss is zero for ZFR-ZFT at $\rho = 0$, which is consistent with Lemma 4.2.
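The limiting gaps quoted in Corollary 5.1 can be checked with a few lines of Python. This sketch assumes the MRR-MRT gap is $\log_2\bigl((1+3\rho)/(1+\rho)^2\bigr)$, which is consistent with the stated extrema (1 at $\rho = 0$ or 1, and 9/8 at $\rho = 1/3$), and the ZFR-ZFT gap is $\log_2\bigl(1/(1-\rho)\bigr)$; the function names are hypothetical.

```python
import math

def mr_gap(rho):
    """Assumed asymptotic sum-rate gap of MRR-MRT in the symmetric TWRC."""
    return math.log2((1 + 3 * rho) / (1 + rho) ** 2)

def zf_gap(rho):
    """Assumed asymptotic sum-rate gap of ZFR-ZFT in the symmetric TWRC (rho < 1)."""
    return math.log2(1 / (1 - rho))

# Locate the worst-case MRR-MRT gap on a fine grid of correlation values.
grid = [k / 3000 for k in range(3001)]
worst_rho = max(grid, key=mr_gap)

print(round(mr_gap(1 / 3), 4), round(zf_gap(1 / 3), 4), worst_rho)
```

The MRR-MRT gap stays bounded over the whole range of $\rho$, whereas the ZFR-ZFT gap grows without bound as $\rho \to 1$.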

VI. Numerical Results 

In this section, we present numerical results on the achievable rates of the various beamforming schemes considered in this paper, and compare them with those of other existing schemes in the literature. For convenience, we assume that $h_1$ is a randomly generated CSCG vector $\sim \mathcal{CN}(0, I)$, and $h_1$ is normalized by its own vector norm such that $\|h_1\| = 1$. We then generate $h_2$ according to $h_2 = \sqrt{\rho}\, h_1 + \sqrt{1 - \rho}\, h_w$, where $h_w$ is also a normalized CSCG random vector, $\|h_w\| = 1$ and $h_1^H h_w = 0$. Thereby, it can be easily verified that $\|h_2\| = 1$ and $|h_1^H h_2|^2 = \rho$. It is assumed that $M = 4$ in this section.
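The channel-generation recipe above can be sketched as follows. The text does not say how the orthogonal vector $h_w$ is produced, so this sketch assumes a single Gram-Schmidt step on an independent Gaussian draw; the function name `correlated_channels` and the seed are hypothetical.

```python
import numpy as np

def correlated_channels(M, rho, rng):
    """Draw unit-norm h1, h2 with |h1^H h2|^2 = rho (orthogonalization is an assumption)."""
    h1 = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    h1 /= np.linalg.norm(h1)
    w = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    w -= h1 * np.vdot(h1, w)        # remove the component along h1 (vdot conjugates h1)
    w /= np.linalg.norm(w)          # unit-norm vector orthogonal to h1
    h2 = np.sqrt(rho) * h1 + np.sqrt(1 - rho) * w
    return h1, h2

h1, h2 = correlated_channels(4, 0.5, np.random.default_rng(0))
print(np.linalg.norm(h2), abs(np.vdot(h1, h2)) ** 2)
```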

A. Capacity Region of ANC-Based TWRC 

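Fig. 2 shows the capacity region, $\mathcal{C}(P_1, P_2, P_R)$, for the ANC-based TWRC with $P_1 = P_2 = P_R = 10$ and $\rho = 0.5$. It is observed that $\mathcal{C}(P_1, P_2, P_R)$ is symmetric over $r_{12}$ and $r_{21}$ in this case. Notice that the boundary rate-pairs of $\mathcal{C}(P_1, P_2, P_R)$ are obtained from the union over those of the achievable rate regions, $\mathcal{R}(p_1, p_2, P_R)$'s, defined in (8), with different values of $p_1$ and $p_2$, $0 \le p_1 \le P_1$ and $0 \le p_2 \le P_2$. Boundary rate-pairs of each constituent $\mathcal{R}(p_1, p_2, P_R)$ are obtained by solving problem (15) using Algorithm 3.1 with different rate-profile vectors $\alpha$'s. It is observed that when $p_1 = 0$ and $p_2 = P_2$, $\mathcal{R}(p_1, p_2, P_R)$ collapses onto the horizontal rate axis of $r_{21}$, and the maximum value of $r_{21}$ becomes the capacity of the OWRC with S2 transmitting to S1 via R. Similarly, $\mathcal{R}(P_1, 0, P_R)$ collapses onto the vertical rate axis of $r_{12}$, and the maximum value of $r_{12}$ becomes the capacity of the OWRC with S1 transmitting to S2 via R.

$$R_{\mathrm{LB}}^{\mathrm{MR}} = \frac{1}{2}\log_2\!\left(1 + \frac{\theta_2 p_2}{\dfrac{1+3\rho}{(1+\rho)^2}\left(1 + \dfrac{\theta_1 p_1 + \theta_2 p_2}{\theta_1 P_R}\right) + \dfrac{2}{\theta_1 (1+\rho) P_R}}\right) + \frac{1}{2}\log_2\!\left(1 + \frac{\theta_1 p_1}{\dfrac{1+3\rho}{(1+\rho)^2}\left(1 + \dfrac{\theta_1 p_1 + \theta_2 p_2}{\theta_2 P_R}\right) + \dfrac{2}{\theta_2 (1+\rho) P_R}}\right) \tag{35}$$

$$R_{\mathrm{LB}}^{\mathrm{ZF}} = \log_2\!\left(1 + \frac{2 p_1 p_2}{\left(1 + \dfrac{p_1 + p_2}{P_R}\right)\max(p_1, p_2)\,\dfrac{\theta_1 + \theta_2}{\theta_1\theta_2(1-\rho)} + \dfrac{p_1 + p_2}{P_R}\left(\dfrac{\theta_1 + \theta_2}{\theta_1\theta_2(1-\rho)}\right)^{2}}\right) \tag{36}$$

$$C_{\mathrm{UB}}^{(0)} = \log_2(P_R) + \frac{1}{2}\log_2\!\left(\frac{\theta_1\theta_2}{(K_2 + \theta_2/\theta_1)(K_1 + \theta_1/\theta_2)}\right) + o(1) \tag{37}$$

$$R_{\mathrm{LB}}^{\mathrm{MR}} = \log_2(P_R) + \frac{1}{2}\log_2\!\left(\frac{\theta_1\theta_2 (1+\rho)^4}{(1+3\rho)^2 (K_2 + K_2/K_1 + \theta_2/\theta_1)(K_1 + K_1/K_2 + \theta_1/\theta_2)}\right) + o(1) \tag{38}$$

$$R_{\mathrm{LB}}^{\mathrm{ZF}} = \log_2(P_R) + \log_2\!\left(\frac{2\theta_1\theta_2(1-\rho)}{(\theta_1 + \theta_2)\bigl(1 + \max(K_1, K_2) + \max(K_1/K_2, K_2/K_1)\bigr)}\right) + o(1) \tag{39}$$

Fig. 3. Achievable rate regions of the ANC-based TWRC with $M = 4$, $p_1 = p_2 = 10$, $P_R = 10$, and $\rho = 0.1$ (both axes in bits/complex dimension).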

B. Achievable Rates of Suboptimal Beamforming Schemes 

Next, we examine the achievable rates of the proposed suboptimal relay beamforming schemes. Figs. 3, 4, and 5 show the achievable rate region, $\mathcal{R}(p_1, p_2, P_R)$, for the TWRC with different values of $\rho$, $\rho = 0.1$, 0.5, and 0.8, respectively. It is assumed that the transmit powers at S1 and S2 are fixed as $p_1 = p_2 = 10$, and the relay transmit power constraint is $P_R = 10$. Three relay beamforming schemes are compared in each of these figures: the optimal scheme (Algorithm 3.1), the MRR-MRT scheme (22), and the ZFR-ZFT scheme (23). Note that the boundary rate-pairs of $\mathcal{R}(p_1, p_2, P_R)$ corresponding to MRR-MRT are obtained by varying the ratio between $a_{\mathrm{MR}}$ and $b_{\mathrm{MR}}$ in (22). $\mathcal{R}(p_1, p_2, P_R)$ for ZFR-ZFT is obtained in a similar way. It is observed that the achievable rate region by MRR-MRT is very close to that with the optimal



Fig. 4. Achievable rate region of the ANC-based TWRC with $M = 4$, $p_1 = p_2 = 10$, $P_R = 10$, and $\rho = 0.5$ (curves: Optimal, MRR-MRT, ZFR-ZFT; both axes in bits/complex dimension).

Fig. 5. Achievable rate regions of the ANC-based TWRC with $M = 4$, $p_1 = p_2 = 10$, $P_R = 10$, and $\rho = 0.8$ (both axes in bits/complex dimension).

Fig. 6. Sum-rate versus system SNR for the ANC-based TWRC with $M = 4$ and $\rho = 1/3$.

scheme when the channel correlation coefficient $\rho$ is either small or large, which is in accord with Lemma 4.1. Even for moderate values of $\rho$, e.g., $\rho = 0.5$, the rate loss of MRR-MRT is observed to be negligible, suggesting that MRR-MRT in fact performs very close to the optimal scheme under different channel conditions. In contrast, ZFR-ZFT performs close to the optimal scheme when $\rho$ is small, which is in accord with Lemma 4.2. This is due to the fact that when $h_1$ and $h_2$ are sufficiently decorrelated, ZF-based receive/transmit beamforming at R is able to suppress the UL/DL interference between S1 and S2 with small SNR losses. However, as $\rho$ increases, it is observed that the achievable rates of ZFR-ZFT degrade significantly as compared to those of the optimal scheme or MRR-MRT.

In Fig. 6, we show the achievable sum-rate of TWRC with $\rho = 1/3$ versus the "system" SNR. Under the assumption that the transmit powers at S1, S2, and R are all equal, i.e., $p_1 = p_2 = P_R$, due to the unit-norm channels and unit-variance noises, the system SNR is conveniently set equal to $P_R$. Various sum-rate bounds presented in this paper are shown, including $C_{\mathrm{UB}}^{(s)}$ in (40), $R_{\mathrm{LB}}^{\mathrm{MR}}$ in (35), and $R_{\mathrm{LB}}^{\mathrm{ZF}}$ in (36). In addition, the actual achievable sum-rates of MRR-MRT and ZFR-ZFT, denoted as $R^{\mathrm{MR}}$ and $R^{\mathrm{ZF}}$, respectively, are also shown for comparison. Note that due to the channel symmetry, $a_{\mathrm{MR}}$ and $b_{\mathrm{MR}}$ in (22) should be equal to maximize $R^{\mathrm{MR}}$; thus, from the derivations in Appendix VI it follows that $R^{\mathrm{MR}} = R_{\mathrm{LB}}^{\mathrm{MR}}$ for MRR-MRT. On the other hand, for ZFR-ZFT, $a_{\mathrm{ZF}}$ and $b_{\mathrm{ZF}}$ in (23) should also be equal to maximize $R^{\mathrm{ZF}}$ in this symmetric-channel case; however, from Appendix VII it follows that even with $a_{\mathrm{ZF}} = b_{\mathrm{ZF}}$, $R_{\mathrm{LB}}^{\mathrm{ZF}} < R^{\mathrm{ZF}}$ in general, where $R^{\mathrm{ZF}}$ can be obtained from the RHS of (59). We also show the sum-rates of the following two heuristic schemes: (1) Direct relaying, where the relay beamforming matrix is in the form of $A = \zeta I$, with $\zeta$ being a constant determined by $P_R$; (2) One-way alternative relaying, where four time-slots are used for one round of information exchange between S1 and S2, with two for S2 transmitting to S1 via R, and the other two for S1 to S2 via R, and the corresponding optimal relay beamforming matrices are in the form of $A_{21} = \psi h_1^{*} h_2^{H}$ and $A_{12} = \psi h_2^{*} h_1^{H}$, respectively, with $\psi$ determined by $P_R$ [20], [21]. We denote $R^{\mathrm{DR}}$ and $R^{\mathrm{OW}}$ as the achievable sum-rates of these two schemes, respectively.

It is observed in Fig. 6 that at asymptotically high SNR, the sum-rate of MRR-MRT converges to the sum-capacity upper bound with a constant gap of 0.1699 bits/complex dimension, while ZFR-ZFT has a sum-rate gap of $\log_2(1/(1 - 1/3)) = 0.5850$ bits/complex dimension. The above observations agree with Corollary 5.1. It is also observed that the lower bound on the sum-rate by ZFR-ZFT, $R_{\mathrm{LB}}^{\mathrm{ZF}}$, is very tight at all SNR values. Notice that $R^{\mathrm{DR}}$ and $R^{\mathrm{OW}}$ both have significant gaps from $R^{\mathrm{MR}}$ at asymptotically high SNR, since the former has no beamforming gain at R, and the latter roughly incurs a loss of half the spectral efficiency due to alternative relaying.

C. Comparison with DF-Based TWRC 

Finally, we compare the capacity region of ANC/AF-based TWRC derived in this paper with that of DF-based TWRC recently reported in [17], for the same physical TWRC. In order to differentiate the above two capacity regions, we denote the former as $\mathcal{C}_{\mathrm{AF}}$ and the latter as $\mathcal{C}_{\mathrm{DF}}$. Note that with DF relay operation, R first decodes both messages from S1 and S2 as in the conventional Gaussian multiple-access channel (MAC) during the first time-slot; R then re-encodes the decoded messages jointly into a new message, and transmits it over the broadcast channel (BC) to both S1 and S2 during the second time-slot. Each of S1 and S2 decodes the message of the other from the received signal given the side information on its own previously transmitted message (in the first time-slot). The achievable rates of S2 and S1 during the first MAC phase can be expressed as [30]



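$$\mathcal{C}_{\mathrm{DF}}^{\mathrm{MAC}}(P_1, P_2) = \Bigl\{ (r_{21}, r_{12}) : r_{21} \le \log_2\bigl|I + P_2 h_2 h_2^H\bigr|,\ r_{12} \le \log_2\bigl|I + P_1 h_1 h_1^H\bigr|,\ r_{21} + r_{12} \le \log_2\bigl|I + P_1 h_1 h_1^H + P_2 h_2 h_2^H\bigr| \Bigr\}. \tag{43}$$

The maximum achievable rate-pairs during the second BC phase can be expressed as [17]

$$\mathcal{C}_{\mathrm{DF}}^{\mathrm{BC}}(P_R) = \bigcup_{S_R:\ S_R \succeq 0,\ \mathrm{tr}(S_R) \le P_R} \Bigl\{ (r_{21}, r_{12}) : r_{21} \le \log_2\bigl(1 + h_1^T S_R h_1^*\bigr),\ r_{12} \le \log_2\bigl(1 + h_2^T S_R h_2^*\bigr) \Bigr\} \tag{44}$$

where $S_R$ is the transmit signal covariance matrix at R. Note that in order to obtain $\mathcal{C}_{\mathrm{DF}}^{\mathrm{BC}}(P_R)$ in (44), we need to solve a sequence of optimization (WSRMax) problems expressed below with different nonnegative rate weights $w_{21}$ and $w_{12}$.

$$\begin{aligned} \max_{S_R} \quad & w_{21}\log_2\bigl(1 + h_1^T S_R h_1^*\bigr) + w_{12}\log_2\bigl(1 + h_2^T S_R h_2^*\bigr) \\ \text{s.t.} \quad & \mathrm{tr}(S_R) \le P_R,\ S_R \succeq 0. \end{aligned} \tag{45}$$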



Since the above problem is convex, it can be solved by standard convex optimization techniques, e.g., the interior-point method [25]. Unlike the AF relay operation, DF relay operation allows different time allocations between the MAC and BC time-slots. Let $\tau$ and $1 - \tau$ denote the fractions of the total time allocated to the MAC phase and the BC phase, respectively. Then, combining both MAC and BC phases yields the capacity region for DF-based TWRC, $\mathcal{C}_{\mathrm{DF}}(P_1, P_2, P_R)$, expressed as

$$\mathcal{C}_{\mathrm{DF}}(P_1, P_2, P_R) = \bigcup_{\tau:\ 0 \le \tau \le 1} \tau\, \mathcal{C}_{\mathrm{DF}}^{\mathrm{MAC}}(P_1, P_2) \cap (1 - \tau)\, \mathcal{C}_{\mathrm{DF}}^{\mathrm{BC}}(P_R). \tag{46}$$



In Figs. 7 and 8, we show $\frac{1}{2}\mathcal{C}_{\mathrm{DF}}^{\mathrm{MAC}}$, $\frac{1}{2}\mathcal{C}_{\mathrm{DF}}^{\mathrm{BC}}$, $\mathcal{C}_{\mathrm{DF}}$, and $\mathcal{C}_{\mathrm{AF}}$ for $\rho = 0.95$ and 0.8, respectively. It is assumed that $P_1 = P_2 = P_R = 100$. Note that in each figure, $\mathcal{C}_{\mathrm{DF}}$ can be visualized as the union of rate regions, each of which corresponds to the intersection of $\frac{1}{2}\mathcal{C}_{\mathrm{DF}}^{\mathrm{MAC}}$ and $\frac{1}{2}\mathcal{C}_{\mathrm{DF}}^{\mathrm{BC}}$ after they are properly scaled by $2\tau$ and $2(1 - \tau)$, respectively, for a particular value of $\tau$. It is observed that the DF-based TWRC in general has a larger capacity region than the AF-based counterpart. Furthermore, it is observed that this capacity gain enlarges as $\rho$ decreases, i.e., as the channels $h_1$ and $h_2$ become more weakly correlated. This is mainly due to the fact that when the UL channels become less correlated, R is more capable of decoding the messages from S1 and S2 during the MAC phase and, as a result, $\frac{1}{2}\mathcal{C}_{\mathrm{DF}}^{\mathrm{MAC}}$ is observed to enlarge as $\rho$ decreases. If we want to draw a fairer comparison between AF- and DF-based TWRCs with the same energy consumption, we may assume that for the DF-based TWRC, equal-duration time-slots are assigned to the MAC and BC phases, i.e., $\tau = 1/2$, the same as in the AF case. As such, since for both $\rho = 0.95$ and 0.8, $\frac{1}{2}\mathcal{C}_{\mathrm{DF}}^{\mathrm{MAC}}$ appears as a subset of $\frac{1}{2}\mathcal{C}_{\mathrm{DF}}^{\mathrm{BC}}$, we conclude that the capacity region for DF-based TWRC with fixed $\tau = 1/2$ is simply $\frac{1}{2}\mathcal{C}_{\mathrm{DF}}^{\mathrm{MAC}}$. Interestingly, it is observed that $\mathcal{C}_{\mathrm{AF}}$ improves over $\frac{1}{2}\mathcal{C}_{\mathrm{DF}}^{\mathrm{MAC}}$ when $\rho = 0.95$ in the region where the values of $r_{21}$ and $r_{12}$ are close to each other. Notice that in this region the sum-capacity in the AF case is achieved. Since DF relaying incurs larger complexity for encoding/decoding at R as compared with AF relaying, AF relaying may be a more suitable solution in practice where strong channel correlation is encountered. However, in the case of $\rho = 0.8$, it is observed that the capacity improvement of $\mathcal{C}_{\mathrm{AF}}$ over $\frac{1}{2}\mathcal{C}_{\mathrm{DF}}^{\mathrm{MAC}}$ diminishes.

VII. Conclusion and Future Work 

This paper studied the fundamental capacity limits of ANC/AF-based TWRC with multi-antennas at the relay. It was shown that the standard method to characterize the capacity region via WSRMax is not directly applicable to ANC-based TWRC due to the non-convexity of the optimization problem. Therefore, we proposed an alternative method to characterize the capacity region of TWRC by applying the idea of rate profile. As a byproduct, we also provided the solution for the relay power minimization problem under given SNR constraints at the receivers. Due to the bidirectional transmission as well as the self-interference cancellation by ANC, we found that the design of relay beamforming in TWRC differs substantially from the conventional designs for the OWRC or the UL/DL beamforming in the traditional cellular network. We

Note that in the extreme case of $\rho = 1$, the multi-antenna TWRC becomes equivalent to the single-antenna TWRC studied in [3].

Fig. 7. Comparison of capacity region for ANC/AF-based versus DF-based TWRC with $M = 4$, $P_1 = P_2 = P_R = 100$, and $\rho = 0.95$ (curves: DF capacity region, MAC phase; DF capacity region, BC phase; DF capacity region; AF capacity region; both axes in bits/complex dimension).

Fig. 8. Comparison of capacity region for ANC/AF-based versus DF-based TWRC with $M = 4$, $P_1 = P_2 = P_R = 100$, and $\rho = 0.8$ (same curves and axes as Fig. 7).

presented the general form of the optimal relay beamforming structure in TWRC, as well as two low-complexity suboptimal schemes, namely, MRR-MRT and ZFR-ZFT. It was shown that ZFR-ZFT, with the objective of suppressing the UL and DL interferences between S1 and S2, may not perform well in the case of strong channel correlation, while MRR-MRT, with the objective of maximizing the total forwarded signal power from R to S1 and S2, achieves sum-rates and rate regions close to the optimal ones under various SNR and channel conditions. This suggests that MRR-MRT can be a good solution from an implementation viewpoint. It was also shown that the ANC/AF-based TWRC can have a capacity gain over the DF-based TWRC for sufficiently large channel correlations and equal MAC and BC time-durations.

Future work beyond this paper may include the joint design of source and relay beamforming when each source is also equipped with multi-antennas, the relay beamforming design for more than one source pair with different combined unicast/multicast transmissions, and the design of a hybrid AF/DF scheme that may improve the performance of both AF- and DF-based TWRCs. In addition, the study of estimate-and-forward (EF) relay operations for the multi-antenna TWRC is also appealing.

Appendix I
Proof of Theorem 3.1

Without loss of generality, we can express $A$ as

$$A = [U, U^{\perp}]^{*} \begin{bmatrix} B & C \\ D & E \end{bmatrix} [U, U^{\perp}]^{H} \tag{47}$$

$$\phantom{A} = U^{*} B U^{H} + U^{*} C (U^{\perp})^{H} + (U^{\perp})^{*} D U^{H} + (U^{\perp})^{*} E (U^{\perp})^{H} \tag{48}$$

where $U^{\perp} \in \mathbb{C}^{M \times (M-2)}$ satisfies $U^{\perp}(U^{\perp})^{H} = I - U U^{H}$, and $B$, $C$, $D$, and $E$ are complex matrices of size $2 \times 2$, $2 \times (M-2)$, $(M-2) \times 2$, and $(M-2) \times (M-2)$, respectively. First, it can be shown that in (6), $|h_1^T A h_2|^2 = |h_1^T U^{*} B U^{H} h_2|^2$ and $\|A^{H} h_1^{*}\|^2 = \|B^{H} U^{T} h_1^{*}\|^2 + \|C^{H} U^{T} h_1^{*}\|^2 \ge \|B^{H} U^{T} h_1^{*}\|^2$. Thus, it follows that $r_{21}$ does not depend on $D$ and $E$, and is maximized when $C = 0$. Similarly, from (7), we can show that $r_{12}$ is also not related to $D$ and $E$, and is maximized when $C = 0$. Next, for the relay power constraint (8), from (3) it can be shown that $p_R$ is minimized when $C$, $D$, and $E$ are all equal to 0. Since each rate-pair on the boundary of $\mathcal{R}(p_1, p_2, P_R)$ defined in (8) must maximize $r_{21}$ and $r_{12}$ subject to the given $P_R$, we conclude that $C$, $D$, and $E$ in the corresponding $A$ should all be 0. Thus, from (48), we conclude that $A = U^{*} B U^{H}$.

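Appendix II
Proof of Lemma 3.1

In the case of $h_1 \perp h_2$, it can be easily shown that $g_1 = [\|h_1\|, 0]^T$ and $g_2 = [0, \|h_2\|]^T$. Let $B = \begin{bmatrix} a & c \\ d & b \end{bmatrix}$. Substituting $g_1$ and $g_2$ into (13) yields

$$r_{21} \le \frac{1}{2}\log_2\!\left(1 + \frac{\|h_1\|^2 \|h_2\|^2 |c|^2 p_2}{\|h_1\|^2 (|a|^2 + |c|^2) + 1}\right) \tag{49}$$

$$r_{12} \le \frac{1}{2}\log_2\!\left(1 + \frac{\|h_1\|^2 \|h_2\|^2 |d|^2 p_1}{\|h_2\|^2 (|b|^2 + |d|^2) + 1}\right) \tag{50}$$

$$\|h_1\|^2 (|a|^2 + |d|^2)\, p_1 + \|h_2\|^2 (|c|^2 + |b|^2)\, p_2 + |a|^2 + |b|^2 + |c|^2 + |d|^2 \le P_R. \tag{51}$$

It then follows that $r_{21}$ and $r_{12}$ are maximized along with the relay transmit power being minimized when $|a| = 0$ and $|b| = 0$ and, thus, $a = b = 0$. Since in the above rate and power expressions only $|c|^2$ and $|d|^2$ are involved, we can assume w.l.o.g. that $c \ge 0$ and $d \ge 0$.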

Appendix III
Proof of Lemma 3.2

In the case of $h_1 \parallel h_2$, it can be easily shown that $g_1 = [\|h_1\|, 0]^T$ and $g_2 = [\|h_2\|, 0]^T$. Let $B = \begin{bmatrix} a & c \\ d & b \end{bmatrix}$. Similarly to the proof of Lemma 3.1 in Appendix II, by substituting $g_1$ and $g_2$ into (13), it follows that $r_{21}$ and $r_{12}$ are maximized along with the relay transmit power being minimized when $b = c = d = 0$, and we can assume w.l.o.g. that $a \ge 0$.



Appendix IV
Proof of Convergence of Algorithm 3.1

In this appendix, we prove that Algorithm 3.1 guarantees the convergence of $r_{\min}$ to the optimal solution of problem (15). First, we show that $r_{\min}$ is a feasible solution of problem (15): Given $R_{\mathrm{sum}} = r_{\min}$, from Algorithm 3.1 it is easily verified that all three constraints of problem (15) are satisfied. Secondly, suppose that there exists another feasible solution $\hat{r}$ for problem (15) such that $\hat{r} > r_{\min} + \delta_r$ (note that $\delta_r$ can be chosen arbitrarily small in Algorithm 3.1). However, this contradicts the fact that $r_{\max}$, $r_{\max} \le r_{\min} + \delta_r < \hat{r}$, has been proven in Algorithm 3.1 to be an infeasible solution of problem (15), since the required minimum power, $p_R^{*}$, is larger than the given constraint $P_R$ in problem (15). Therefore, by contradiction, it follows that there does not exist such a feasible solution $\hat{r}$ for problem (15). From the above discussion, we conclude that the feasible solution $r_{\min}$ is at most $\delta_r$ lower than the optimal solution of problem (15). By letting $\delta_r \to 0$, convergence of Algorithm 3.1 is thus proved.

Appendix V
Proof of Theorem 3.2

Given $X^{*}$, first we know that at least one of the two inequality constraints in (21) is active at the optimal point, i.e., we have either $\mathrm{tr}(F_1 X^{*}) = 1$ or $\mathrm{tr}(F_2 X^{*}) = 1$, or both. This fact can be proved by contradiction: If at $X^{*}$ both $\mathrm{tr}(F_1 X^{*}) > 1$ and $\mathrm{tr}(F_2 X^{*}) > 1$ hold, we could always find a $t$ with $0 < t < 1$ such that $Y^{*} = t X^{*}$ and $\min(\mathrm{tr}(F_1 Y^{*}), \mathrm{tr}(F_2 Y^{*})) = 1$. We could easily see that $\mathrm{tr}(F_0 Y^{*}) < \mathrm{tr}(F_0 X^{*})$, which means that $X^{*}$ could not be the optimal solution, a contradiction.

From now on, we assume w.l.o.g. that $\mathrm{tr}(F_1 X^{*}) = 1$ such that we have $\mathrm{tr}((F_2 - F_1) X^{*}) \ge 0$. To facilitate the proof of Theorem 3.2, let us first give the following lemma, which is based on Lemma 1 given in [29]; its proof is similar to that in [29] and is omitted here.

Lemma 5.1: Given that $\mathrm{tr}((F_2 - F_1) X^{*}) \ge 0$, there exists a decomposition for $X^{*}$ such that

$$X^{*} = \sum_{j=1}^{r} x_j x_j^{H}$$

and $x_j^{H} (F_2 - F_1) x_j \ge 0$ for all $j = 1, \ldots, r$.

Based on the above lemma, let $y_{ij} = x_j^{H} F_i x_j$, $i = 0, 1, 2$, and $j = 1, \ldots, r$. Now consider the following linear program

$$\begin{aligned} \min_{t_1, \ldots, t_r} \quad & \sum_{j=1}^{r} y_{0j} t_j \\ \text{s.t.} \quad & \sum_{j=1}^{r} y_{1j} t_j \ge 1,\ \sum_{j=1}^{r} y_{2j} t_j \ge 1 \\ & t_j \ge 0,\ j = 1, \ldots, r. \end{aligned} \tag{52}$$

We see that for any feasible set of $t_1, \ldots, t_r$ such that all the inequality constraints are satisfied, $X = \sum_{j=1}^{r} t_j (x_j x_j^{H})$ is a feasible solution for the SDP problem (21). As such, the minimum objective value of the above linear program is the same as that of the SDP problem (21), and one such optimal point is $t_1 = \ldots = t_r = 1$ (which corresponds to $X = \sum_{j=1}^{r} t_j (x_j x_j^{H}) = X^{*}$). Note that the optimal points may not be unique.

Furthermore, given that $x_j^{H} (F_2 - F_1) x_j \ge 0$ for all $j = 1, \ldots, r$ from Lemma 5.1, we have $y_{2j} \ge y_{1j}$ for all $j$'s. Therefore, $\sum_{j=1}^{r} y_{1j} t_j \ge 1$ implies $\sum_{j=1}^{r} y_{2j} t_j \ge 1$, i.e., the second inequality constraint in (52) is redundant. Thus, (52) can be recast as

$$\begin{aligned} \min_{t_1, \ldots, t_r} \quad & \sum_{j=1}^{r} y_{0j} t_j \\ \text{s.t.} \quad & \sum_{j=1}^{r} y_{1j} t_j \ge 1 \\ & t_j \ge 0,\ j = 1, \ldots, r. \end{aligned} \tag{53}$$

When $X^{*}$ can be found for the SDP problem (21), it means that the optimal objective values for both the SDP problem (21) and the linear program (53) are bounded, which implies that (53) must have one basic optimal feasible solution, at which at least $r$ inequality constraints are active (to define an optimal vertex point in the feasible region). Since we only have $r + 1$ inequality constraints in (53), at most one $t_j$ is positive. Actually, we have exactly one $t_j$ positive; otherwise, all-zero $t_j$'s could not be a feasible solution. At such a basic optimal feasible solution, if we have $t_k^{*} > 0$ and $t_j = 0$, $j \ne k$, with $1 \le j, k \le r$, we could infer that there exists an optimal rank-one solution for the SDP problem (21), which could be constructed as

$$X^{**} = t_k^{*} (x_k x_k^{H}).$$

This completes the proof of Theorem 3.2.

Finally, we present a routine to obtain an optimal rank-one solution for problem (20) from $X^{*}$ as follows:

1) Decompose $X^{*}$ in reference to $F_2 - F_1$ as in Lemma 5.1 (for the detailed procedure, refer to the proof of Lemma 1 in [29]).

2) Construct the linear program as shown in (53), and solve for one basic optimal feasible solution. Such an algorithm could be based on solving $r$ parallel sub-problems, where in each sub-problem only one $t_j$ is allowed to take non-zero values. Then the achieved minimum objective values from the $r$ sub-problems are compared to find the global minimum solution.

3) Given the single optimal positive $t_k^{*}$, the rank-one optimal solution for both (20) and (21) is constructed as $X^{**} = t_k^{*} (x_k x_k^{H})$.

Appendix VI
Proof of Lemma 5.1

Let $a_{\mathrm{MR}} = b_{\mathrm{MR}} = \nu$ in $A_{\mathrm{MR}}$ given by (22). We can then show the following equalities:

$$|h_1^T A_{\mathrm{MR}} h_2| = |h_2^T A_{\mathrm{MR}} h_1| = \nu \theta_1 \theta_2 (1 + \rho) \tag{54}$$

$$\|A_{\mathrm{MR}}^{H} h_1^{*}\|^2 = \|A_{\mathrm{MR}} h_1\|^2 = \nu^2 \theta_1^2 \theta_2 (1 + 3\rho) \tag{55}$$

$$\|A_{\mathrm{MR}}^{H} h_2^{*}\|^2 = \|A_{\mathrm{MR}} h_2\|^2 = \nu^2 \theta_1 \theta_2^2 (1 + 3\rho) \tag{56}$$

$$\mathrm{tr}(A_{\mathrm{MR}} A_{\mathrm{MR}}^{H}) = 2\nu^2 \theta_1 \theta_2 (1 + \rho). \tag{57}$$

Let the relay transmit power $p_R$ in (3) be equal to the maximum value $P_R$. Using the above equalities, from (3) it follows that

$$\nu^2 = \frac{P_R}{\theta_1 \theta_2 (1 + 3\rho)(\theta_1 p_1 + \theta_2 p_2) + 2\theta_1 \theta_2 (1 + \rho)}. \tag{58}$$

Substituting (58) into the above equalities, and from (6) and (7), the lower bound on the sum-rate given in (35) follows.
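The equalities (54)-(57) can be spot-checked numerically. The sketch below assumes $A_{\mathrm{MR}} = \nu (h_2^{*} h_1^{H} + h_1^{*} h_2^{H})$, i.e., (22) with $a_{\mathrm{MR}} = b_{\mathrm{MR}} = \nu$, $H_{\mathrm{UL}} = [h_1, h_2]$ and $H_{\mathrm{DL}} = [h_2, h_1]^T$, and it assumes the right-hand sides of (55) and (56) scale as $\theta_1^2\theta_2$ and $\theta_1\theta_2^2$, respectively; the values of `M` and `nu` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
M, nu = 4, 0.8                      # hypothetical antenna count and weight
h1 = rng.standard_normal(M) + 1j * rng.standard_normal(M)
h2 = rng.standard_normal(M) + 1j * rng.standard_normal(M)

theta1 = np.vdot(h1, h1).real       # ||h1||^2
theta2 = np.vdot(h2, h2).real       # ||h2||^2
rho = abs(np.vdot(h1, h2)) ** 2 / (theta1 * theta2)

# Assumed form of A_MR with equal weights: nu * (h2^* h1^H + h1^* h2^H).
A = nu * (np.outer(h2.conj(), h1.conj()) + np.outer(h1.conj(), h2.conj()))

lhs54, rhs54 = abs(h1 @ A @ h2), nu * theta1 * theta2 * (1 + rho)
lhs55, rhs55 = np.linalg.norm(A @ h1) ** 2, nu**2 * theta1**2 * theta2 * (1 + 3 * rho)
lhs56, rhs56 = np.linalg.norm(A @ h2) ** 2, nu**2 * theta1 * theta2**2 * (1 + 3 * rho)
lhs57, rhs57 = np.trace(A @ A.conj().T).real, 2 * nu**2 * theta1 * theta2 * (1 + rho)

for name, lhs, rhs in [("54", lhs54, rhs54), ("55", lhs55, rhs55),
                       ("56", lhs56, rhs56), ("57", lhs57, rhs57)]:
    print(name, np.isclose(lhs, rhs))
```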



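Appendix VII
Proof of Lemma 5.2

Let $a_{\mathrm{ZF}} = b_{\mathrm{ZF}} = \nu$ in $A_{\mathrm{ZF}}$ given by (23). Denote $H_{\mathrm{UL}}^{\dagger} = [a_1, a_2]^T$. From (6) and (7), we can show that

$$R^{\mathrm{ZF}} \ge \frac{1}{2}\log_2\!\left(1 + \frac{\nu^2 p_1}{\nu^2 \|a_1\|^2 + 1}\right) + \frac{1}{2}\log_2\!\left(1 + \frac{\nu^2 p_2}{\nu^2 \|a_2\|^2 + 1}\right) \tag{59}$$

$$\phantom{R^{\mathrm{ZF}}} \ge \log_2\!\left(1 + \frac{2 p_1 p_2}{\|a_2\|^2 p_1 + \|a_1\|^2 p_2 + \dfrac{p_1 + p_2}{\nu^2}}\right) \tag{60}$$

where (60) is due to Jensen's inequality (see, e.g., [30]) and the convexity of the function $f(x) = \log_2(1 + 1/x)$, $x > 0$ [25]. Let the relay transmit power $p_R$ in (3) be equal to the maximum value $P_R$. Then, we obtain from (3) that

$$\nu^2 = \frac{P_R}{\|a_2\|^2 p_1 + \|a_1\|^2 p_2 + \mathrm{tr}\bigl[H_{\mathrm{DL}}^{\dagger} H_{\mathrm{UL}}^{\dagger} (H_{\mathrm{UL}}^{\dagger})^{H} (H_{\mathrm{DL}}^{\dagger})^{H}\bigr]} \tag{61}$$

$$\phantom{\nu^2} \ge \frac{P_R}{\|a_2\|^2 p_1 + \|a_1\|^2 p_2 + \bigl(\|a_1\|^2 + \|a_2\|^2\bigr)^2} \tag{62}$$

where (62) is due to the fact that $\mathrm{tr}(XY) \le \mathrm{tr}(X)\,\mathrm{tr}(Y)$ if $X \succeq 0$ and $Y \succeq 0$ [31]. Using (62), the term inside $\log_2(\cdot)$ in (60) can be further lower-bounded by

$$1 + \frac{2 p_1 p_2}{\left(1 + \dfrac{p_1 + p_2}{P_R}\right)\bigl(\|a_2\|^2 p_1 + \|a_1\|^2 p_2\bigr) + \dfrac{p_1 + p_2}{P_R}\bigl(\|a_1\|^2 + \|a_2\|^2\bigr)^2}.$$

Since it can be shown that $\|a_1\|^2 + \|a_2\|^2 = \frac{\theta_1 + \theta_2}{\theta_1 \theta_2 (1 - \rho)}$, substituting this equality into the above expression, together with $\|a_2\|^2 p_1 + \|a_1\|^2 p_2 \le \max(p_1, p_2)\bigl(\|a_1\|^2 + \|a_2\|^2\bigr)$, yields the lower bound on the sum-rate given in (36).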